freebsd-skq

Author	SHA1	Message	Date
alfred	e16a3900a7	All unimplemented VFS ops now have entries in kern/vfs_default.c that return reasonable defaults. This avoids confusing and ugly casting to eopnotsupp or making dummy functions. Bogus casting of filesystem sysctls to eopnotsupp() have been removed. This should make *_vfsops.c more readable and reduce bloat. Reviewed by: msmith, eivind Approved by: phk Tested by: Jeroen Ruigrok/Asmodai <asmodai@wxs.nl>	1999-09-07 22:42:38 +00:00
phk	d311a0563b	remove unused variables.	1999-08-28 19:21:03 +00:00
peter	3b842d34e8	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
phk	591c94d4c6	Simplify the handling of VCHR and VBLK vnodes using the new dev_t: Make the alias list a SLIST. Drop the "fast recycling" optimization of vnodes (including the returning of a preexisting but stale vnode from checkalias). It doesn't buy us anything now that we don't hardlimit vnodes anymore. Rename checkalias2() and checkalias() to addalias() and addaliasu() - which takes dev_t and udev_t arg respectively. Make the revoke syscalls use vcount() instead of VALIASED. Remove VALIASED flag, we don't need it now and it is faster to traverse the much shorter lists than to maintain the flag. vfs_mountedon() can check the dev_t directly, all the vnodes point to the same one. Print the devicename in specfs/vprint(). Remove a couple of stale LFS vnode flags. Remove unimplemented/unused LK_DRAINED;	1999-08-26 14:53:31 +00:00
peter	d4c0c0bd4a	Convert all the nfs macros to do { blah } while (0) to ensure it works correctly in if/else etc. egcs had probably picked up most of the problems here before with "ambiguous braces" etc, but this should increase the robustness a bit. Based on an idea from Eivind Eklund.	1999-08-19 14:50:12 +00:00
alc	075745f2e2	Add the (inline) function vm_page_undirty for clearing the dirty bitmask of a vm_page. Use it. Submitted by: dillon	1999-08-17 04:02:34 +00:00
dt	28a96b8235	nfs_getcacheblk() can return 0 if the mount is interruptible. It needs to be checked by the caller. Broken in: rev. 1.70 (1999/05/02)	1999-08-12 18:04:39 +00:00
phk	e938d317d5	Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>. Add a few fields to struct specinfo, paving the way for the fun part.	1999-08-08 18:43:05 +00:00
peter	fce0d9b321	Don't over-allocate and over-copy shorter NFSv2 filehandles and then correct the pointers afterwards. It's kinda bogus that we generate a 24 (?) byte filehandle (2 x int32 fsid and 16 byte VFS fhandle) and pad it out to 64 bytes for NFSv3 with garbage. The whole point of NFSv3's variable filehandle length was to allow for shorter handles, both in memory and over the wire. I plan on taking a shot at fixing this shortly.	1999-08-04 14:41:39 +00:00
msmith	8d698b77fd	As described by the submitter: I did some tcpdumping the other day and noticed that GETATTR calls were frequently followed by an ACCESS call to the same file. The attached patch changes nfs_getattr to fill the access cache as a side effect. This is accomplished by calling ACCESS rather than GETATTR. This implies a modest overhead of 4 bytes in the request and 8 bytes in the response compared to doing a vanilla GETATTR. ... [The patch comprises two parts] The first is the "real" patch, the second counts misses and hits rather than fills and hits. The difference is subtle but important because both nfs_getattr and nfs_access now fill the cache. It also changes the default value of nfsaccess_cache_timeout to better match the attribute cache. IMHO, file timestamps change much more frequently than protection bits. Submitted by: Bjoern Groenvall <bg@sics.se> Reviewed by: dillon (partially)	1999-07-31 01:51:58 +00:00
wpaul	7f07913dd9	Close PR #12651 : the hash calculation routine has changed in other parts of the kernel but was not updated in nfs_readdirplusrpc().	1999-07-30 04:51:35 +00:00
wpaul	9bf69787ba	Fix two bugs in nfs_readdirplus(). The first is that in some cases, vnodes are locked and never unlocked, which leads to processes starting to wedge up after doing a mount -o nfsv3,tcp,rdirplus foo:/fs /fs; ls /fs. The second is that sometimes cnp is accessed without having been properly initialized: cnp->cn_nameptr points to an earlier name while "len" contains the length of a current name of different size. This leads to an attempt to dereference *(cn->cn_nameptr + len) which will sometimes cause a page fault and a panic. With these two fixes, client side readdirplus works correctly with FreeBSD, IRIX 6.5.4 and Solaris 2.5.1 and 2.6 servers. Submitted by: Matthew Dillon <dillon@backplane.com>	1999-07-30 04:02:04 +00:00
wpaul	a0ef521585	Correct the sanity test length calculation in nfsrv_readdirplus(): len is being incremented by 4 bytes too few each time through the loop, which allows more data into the mbuf chain than we really want. In the worst case, when we're using 32K read/write sizes with a TCP client, this causes readdirplus replies to sometimes exceed NFS_MAXPACKET which leads to a panic. This problem cropped up for me using an IRIX 6.5.4 NFSv3 TCP client with 32K read/write sizes, however supposedly it can be triggered by WinNT NFS servers too. In theory, it can probably be triggered by any NFS v3 implementation using TCP as long as it's using the maximum block size. Reviewed by: Matthew Dillon <dillon@backplane.com>	1999-07-29 21:42:57 +00:00
alc	1f3845a859	Clear error in nfsrv_create when we have a valid reply so that that reply is actually transmitted. Submitted by: dillon	1999-07-28 08:20:49 +00:00
phk	6c373ff516	I have not one single time remembered the name of this function correctly so obviously I gave it the wrong name. s/umakedev/makeudev/g	1999-07-17 18:43:50 +00:00
peter	9f31938811	Fix warning. va_fsid is udev_t, which is int32_t. No need to use %lx.	1999-07-01 13:32:54 +00:00
julian	b242252948	Submitted by: "David E. Cross" <crossd@cs.rpi.edu> Matt missed a line..	1999-06-30 04:29:13 +00:00
julian	0154a689b3	Submitted by: Conrad Minshall <conrad@apple.com> Reviewed by: Matthew Dillon <dillon@apollo.backplane.com> The following ugly hack to the exit path of nfs_readlinkrpc() circumvents an Auspex bug: for symlinks longer than 112 (0x70) they return a 1024 byte xdr string - the correct data with many nulls appended. Without this fix namei returns ENAMETOOLONG, at least it does on our source base and on FreeBSD 3.0. Note we do not (and should not) rely upon their null padding.	1999-06-30 02:53:51 +00:00
peter	5ecb2e0dad	Fix a KASSERT() that was negated and led to: nfs_strategy: buffer 0xxxxx not locked when you attempted to write and had INVARIANTS turned on.	1999-06-28 12:34:40 +00:00
peter	6d9ab211eb	Minor tweaks to make sure (new) prerequisites for <sys/buf.h> (mostly splbio()/splx()) are #included in time.	1999-06-27 11:44:22 +00:00
mckusick	5b58f2f951	Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.	1999-06-26 02:47:16 +00:00
julian	7dc5713bf7	Matt's NFS fixes. Submitted by: Matt Dillon Reviewed by: David Cross, Julian Elischer, Mike Smith, Drew Gallatin 3.2 version to follow when tested	1999-06-23 04:44:14 +00:00
mjacob	63ed685667	Thanks to Bruce for noticing this.... compare against the new nfsnode's mount point for seeing whether or not the new nfsnode is already in the hash queue. We're pretty much guaranteed that the old nfsnode is already in the hash queue. Wank! Infinite Loop! Looks like just a minor typo.... (ah the influence of fortran ... np && np2... why not nfsnode_the_first && nfsnode_the_second???)...	1999-06-19 19:33:44 +00:00
mckusick	88e39a63db	Add a vnode argument to VOP_BWRITE to get rid of the last vnode operator special case. Delete special case code from vnode_if.sh, vnode_if.src, umap_vnops.c, and null_vnops.c.	1999-06-16 23:27:55 +00:00
mjacob	792005e7a3	Use vput instead of vrele. Reviewed by: Matthew Dillon <dillon@apollo.backplane.com> Submitted by: Ville-Pertti Keinonen <will@iki.fi> Obtained from: Matthew Dillon <dillon@apollo.backplane.com>	1999-06-16 18:35:58 +00:00
mjacob	06fe6c5ca3	If we retry this operation from the top of this routine, we need to make sure we've freed any allocated resources (to avoid a memory leak) and do the right thing with respect to the nfs node hash lock we'd acquired.	1999-06-15 23:24:14 +00:00
peter	21732dea0c	Various changes lifted from the OpenBSD cvs tree: txdr_hyper and fxdr_hyper tweaks to avoid excessive CPU order knowledge. nfs_serv.c: don't call nfsm_adj() with negative values, windows clients could crash servers when doing a readdir of a large directory. nfs_socket.c: Use IP_PORTRANGE to get a privileged port without a spin loop trying to bind(). Don't clobber a mbuf pointer or we get panics on a NFS3ERR_JUKEBOX error from a server when reusing a freed mbuf. nfs_subs.c: Don't lose st_blocks on NFSv2 mounts when > 2GB. Obtained from: OpenBSD	1999-06-05 05:35:03 +00:00
peter	06a6a667a6	Fix a malloc race Obtained from: OpenBSD (csapuntz)	1999-06-05 05:26:36 +00:00
peter	4864590454	Don't mistake a non-async block that needs to be committed for an interrupted write. Obtained from: fvdl@NetBSD.org via OpenBSD.	1999-06-05 05:25:37 +00:00
phk	7e26ca1d1a	Divorce "dev_t" from the "major|minor" bitmap, which is now called udev_t in the kernel but still called dev_t in userland. Provide functions to manipulate both types: major() umajor() minor() uminor() makedev() umakedev() dev2udev() udev2dev() For now they're functions, they will become in-line functions after one of the next two steps in this process. Return major/minor/makedev to macro-hood for userland. Register a name in cdevsw[] for the "filedescriptor" driver. In the kernel the udev_t appears in places where we have the major/minor number combination, (ie: a potential device: we may not have the driver nor the device), like in inodes, vattr, cdevsw registration and so on, whereas the dev_t appears where we carry around a reference to an actual device. In the future the cdevsw and the aliased-from vnode will be hung directly from the dev_t, along with up to two softc pointers for the device driver and a few housekeeping bits. This will essentially replace the current "alias" check code (same buck, bigger bang). A little stunt has been provided to try to catch places where the wrong type is being used (dev_t vs udev_t), if you see something not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if it makes a difference. If it does, please try to track it down (many hands make light work) or at least try to reproduce it as simply as possible, and describe how to do that. Without DEVT_FASCIST I believe this patch is a no-op. Stylistic/posixoid comments about the userland view of the <sys/*.h> files welcome now, from userland they now contain the end result. Next planned step: make all dev_t's refer to the same devsw[] which means convert BLK's to CHR's at the perimeter of the vnodes and other places where they enter the game (bootdev, mknod, sysctl).	1999-05-11 19:55:07 +00:00
phk	f57a01ebfc	remove b_proc from struct buf, it's (now) unused. Reviewed by: dillon, bde	1999-05-06 20:00:34 +00:00
peter	73556bfee1	Add sufficient braces to keep egcs happy about potentially ambiguous if/else nesting.	1999-05-06 18:13:11 +00:00
alc	7693bdc245	All directory accesses must be made with NFS_DIRBLKSIZE chunks to avoid confusing the directory read cookie cache. The nfs_access implementation for v2 mounts attempts to read from the directory if root is the user so that root can't access cached files when the server remaps root to some other user. Submitted by: Doug Rabson <dfr@nlsystems.com> Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>	1999-05-03 20:59:14 +00:00
alc	5cb08a2652	The VFS/BIO subsystem contained a number of hacks in order to optimize piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas. The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most notably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistency issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE. getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vice versa). biodone() is now responsible for setting B_CACHE when a successful read completes. B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly. There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE. Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>	1999-05-02 23:57:16 +00:00
phk	ca21a25f17	This Implements the mumbled about "Jail" feature. This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do. For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers". Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname. Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors. It generally does what one would expect, but setting up a jail still takes a little knowledge. A few notes: I have no scripts for setting up a jail, don't ask me for them. The IP number should be an alias on one of the interfaces. mount a /proc in each jail, it will make ps more useable. /proc/<pid>/status tells the hostname of the prison for jailed processes. Quotas are only sensible if you have a mountpoint per prison. There are no provisions for stopping resource-hogging. Some "#ifdef INET" and similar may be missing (send patches!) If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome! Tools, comments, patches & documentation most welcome. Have fun... Sponsored by: http://www.rndassociates.com/ Run for almost a year by: http://www.servetheweb.com/	1999-04-28 11:38:52 +00:00
phk	16e3fbd2c1	Suser() simplification: 1: s/suser/suser_xxx/ 2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>. 3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/ The remaining suser_xxx() calls will be scrutinized and dealt with later. There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce. More changes to the suser() API will come along with the "jail" code.	1999-04-27 11:18:52 +00:00
dt	9b8660ce53	Fixed printf format errors on alpha.	1999-04-24 11:29:48 +00:00
peter	2779806507	Close a potential mbuf and/or mbuf cluster leak in the client-side NFS statfs() code. Free the whole chain, not just the first one.	1999-04-10 18:53:29 +00:00
peter	907d1de4fb	Hold nfsd's upages in-core with PHOLD rather than P_NOSWAP.	1999-04-06 03:07:54 +00:00
julian	0ed09d2ad5	Catch a case spotted by Tor where files mmapped could leave garbage in the unallocated parts of the last page when the file ended on a frag but not a page boundary. Delimited by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF, in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c ufs/ufs/ufs_readwrite.c kern/vfs_bio.c Submitted by: Matt Dillon <dillon@freebsd.org> Reviewed by: Alan Cox <alc@freebsd.org>	1999-04-05 19:38:30 +00:00
julian	f27c95753f	Reviewed by: Many at different times in different parts, including alan, john, me, luoqi, and kirk Submitted by: Matt Dillon <dillon@frebsd.org> This change implements a relatively sophisticated fix to getnewbuf(). There were two problems with getnewbuf(). First, the writerecursion can lead to a system stack overflow when you have NFS and/or VN devices in the system. Second, the free/dirty buffer accounting was completely broken. Not only did the nfs routines blow it trying to manually account for the buffer state, but the accounting that was done did not work well with the purpose of their existence: figuring out when getnewbuf() needs to sleep. The meat of the change is to kern/vfs_bio.c. The remaining diffs are all minor except for NFS, which includes both the fixes for bp interaction AND fixes for a 'biodone(): buffer already done' lockup. Sys/buf.h also contains a chaining structure which is not used by this patchset but is used by other patches that are coming soon. This patch delineated by tags PRE_MAT_GETBUF and POST_MAT_GETBUF. (sorry for the missing T matt)	1999-03-12 02:24:58 +00:00
peter	2cb5b6d7d5	Untangle the nfs send and receive queue locking a little. One lock routine was [ab]used for two different things, and you couldn't tell from the wait channel which one had wedged. Catch a few things missing from NFS_NOSERVER.	1999-02-25 00:03:51 +00:00
dfr	0f4e134c5f	Move the declaration of the vfs.nfs sysctl node outside an ifdef so that it builds if NFS_NOSERVER is defined. Spotted by: Bruce Evans <bde@zeta.org.au>	1999-02-18 09:19:41 +00:00
bde	9cf8505943	Fixed bitrot in NFS_ACDEBUG option.	1999-02-17 13:59:29 +00:00
dfr	22ceb237f0	* Change sysctl from using linker_set to construct its tree using SLISTs. This makes it possible to change the sysctl tree at runtime. * Change KLD to find and register any sysctl nodes contained in the loaded file and to unregister them when the file is unloaded. Reviewed by: Archie Cobbs <archie@whistle.com>, Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)	1999-02-16 10:49:55 +00:00
dillon	717990555d	General additional cleanup of VOP API for NFS ops - mainly NFS ignoring the API for freeing up cnp's. This cleanup should not affect nominal operation one way or the other since NFS VOPs just happen to be called with flags that match what it actually does to the NAMEI components it gets. Still, if an NFS error occurred, there was probably some memory leakage of NAMEI components with certain NFS VOP ops.	1999-02-13 09:47:30 +00:00
dillon	ab62b188ab	PR: kern/9970 Remove incorrect vput() in nfs_link()	1999-02-13 08:01:59 +00:00
dillon	42bebed095	Flush delayed-write data out prior to issuing a rename rpc. This appears to fix the problem w/ NFSV3 whereby a make installworld would get into high-network-bandwidth situations continuously trying to retry nfs writes that fail with a 'stale file handle' error.	1999-02-06 07:48:56 +00:00
dillon	ca558df378	Fix warnings related to -Wall -Wcast-qual	1999-01-28 17:32:05 +00:00
dillon	975fba8a24	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-28 00:57:57 +00:00
dillon	f9a4729a9b	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile. This commit includes significant work to proper handle const arguments for the DDB symbol routines.	1999-01-27 23:45:44 +00:00
dillon	afb6772e77	Fix nasty bug in nfs_access(). A conditional was if (a = b) instead of if (a == b).	1999-01-27 22:45:49 +00:00
dillon	72e4bdcf94	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-27 22:45:13 +00:00
dillon	dbf5cd2b57	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-27 22:42:27 +00:00
dillon	df24433bbe	This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>	1999-01-21 08:29:12 +00:00
eivind	2f82e08ebd	Remove two cases of unused variable sp3.	1999-01-12 12:39:14 +00:00
eivind	ffaaca5874	Remove the 'waslocked' parameter to vfs_object_create().	1999-01-05 18:50:03 +00:00
hoek	5e720f3594	Silence -Wtrigraph. Submitted by: Bradley Dunn <bradley@dunn.org> (pr: kern/8817)	1998-12-30 00:37:44 +00:00
dfr	1cee4d444d	Fix for creating files on a Solaris 7 server with NFSv3 (the request was slightly garbled but older servers seemed to understand it). Reviewed by: David O'Brien <obrien@nuxi.ucdavis.edu>	1998-12-25 10:34:27 +00:00
dt	7212f6ac0c	Added 3 new errno values, required by various standards: EOVERFLOW, ECANCELED, EILSEQ. Fixed ibcs2 and especially linux EIDRM and ENOMSG errno mapping. Reviewed by: Dan Nelson <dnelson@emsphone.com>	1998-12-14 18:54:04 +00:00
dt	9c55aebaa4	(Hopefully) fix support for "large" files. Mostly cast block numbers to off_t before they are multiplied by block sizes.	1998-12-14 17:51:30 +00:00
eivind	56b8c7c844	Remove the if fixed in the last commit; bde quite correctly point out that it can never fail.	1998-12-09 15:12:53 +00:00
eivind	d74ab3f9f8	Fix typo (; in "if (vp == NULL);").	1998-12-08 23:11:24 +00:00
archie	60d13c7a9d	The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.	1998-12-07 21:58:50 +00:00
archie	982e80577d	Examine all occurrences of sprintf(), strcat(), and str[n]cpy() for possible buffer overflow problems. Replaced most sprintf()'s with snprintf(); for others cases, added terminating NUL bytes where appropriate, replaced constants like "16" with sizeof(), etc. These changes include several bug fixes, but most changes are for maintainability's sake. Any instance where it wasn't "immediately obvious" that a buffer overflow could not occur was made safer. Reviewed by: Bruce Evans <bde@zeta.org.au> Reviewed by: Matthew Dillon <dillon@apollo.backplane.com> Reviewed by: Mike Spengler <mks@networkcs.com>	1998-12-04 22:54:57 +00:00
dillon	dd581b71fb	Make bootp error message slightly more verbose	1998-12-03 20:28:23 +00:00
msmith	6461f7d9c1	Reimplement the NFS ACCESS RPC cache as an "accelerator" rather than a true cache. If the cached result lets us say "yes", then go with that. If we're not sure, or we think the answer might be "no", go to the wire to be certain. This avoids all of the possible false veto cases, and allows us to key the cached value with just the UID for which the cached value holds, reducing the bloat of the nfsnode structure from 104 bytes to just 12 bytes. Since the "yes" case is by far the most common, this should still provide a substantial performance improvement. Also default the cache to on, with a conservative timeout (2 seconds). This improves performance if NFS is loaded as a KLD module, as there's not (yet) code to parse an option out of the module arguments to set it, and sysctl doesn't work (yet) for OIDs in modules. The 'accelerator' mode was suggested by Bjoern Groenvall (bg@sics.se) Feedback on this would be appreciated as testing has been necessarily limited by Comdex, and it would be valuable to have this in 2.2.8.	1998-11-15 20:36:18 +00:00
msmith	5a28cf0283	Avoid a null pointer reference if the target of an NFS rename has been sillrenamed, or if the source vnode doesn't have an associated nfsnode. Bug report from Andrew Gallatin <gallatin@cs.duke.edu>	1998-11-13 22:58:48 +00:00
dfr	f8c57dfec2	Fix a panic in nfsrv_dorec() where a NULL pointer could be passed to free() sometimes. Reviewed by: Eric Haug <ejh@eas.slu.edu>	1998-11-13 09:44:12 +00:00
msmith	5ffcaa816e	Implement NFS ACCESS RPC result caching. This yields startling performance increases for NFS clients for many access profiles, due to the fact that ACCESS results are persistently cached in the namecache in many cases. Note that the code is somewhat conservative in that it requires an exact credential match for a cache hit. This bloats the nfsnode structure by sizeof(struct ucred) (96 bytes). Any less conservative approach opens the possibility for a false veto in eg. setuid applications. Alternative suggestions would be welcomed. The cache is normally disabled, to activate set the sysctl variable vfs.nfs.access_cache_timeout to a nonzero value. This is the time in seconds that a cached entry will be considered valid; useful values appear to be 2-10 seconds. Performance of the cache can be monitored with the vfs.nfs.access_cache_hits and vfs.nfs.access_cache_fills variables.	1998-11-13 02:39:09 +00:00
peter	4dde2ece8b	Remove [apparently] bogus casts to u_long for the vnode_pager_setsize() second argument. np_size is a 64 bit int, so is the second arg. This might have caused needless 2G/4G file size problems. I believe it was Bruce who queried this.	1998-11-09 07:00:14 +00:00
peter	052ed056ae	vm_object_page_clean() last arg changed from TRUE to OBJPC_SYNC. I'm not sure that this is necessary to be a sync write here since a VOP_FSYNC() follows and it will schedule, sort and complete the writes that the vm_object_page_clean() started (as I think I understand things).	1998-10-31 15:39:31 +00:00
peter	8ef35acf90	Use TAILQ macros for clean/dirty block list processing. Set b_xflags rather than abusing the list next pointer with a magic number.	1998-10-31 15:31:29 +00:00
mckusick	79fbc60c6a	In nfs_link(), check for a cross-device mount before looking in the v_data field. Obtained from: Charles Hannum, via Frank van der Linden <frank@wins.uva.nl>	1998-09-29 23:39:37 +00:00
mckusick	74f40b1c41	Missing vput when cross-device link error is detected in nfs_link.	1998-09-29 23:29:48 +00:00
mckusick	17402e8897	During truncation, have to notify the VM about the new size of the NFS file before doing the nfs_vinvalbuf operation. Otherwise some invalid data may show up in an mmap.	1998-09-29 23:28:32 +00:00
mckusick	a57013de62	Frank sez: 'It fixes a problem with servers that return 0 values for some of the fsinfo RPC fields. It is strictly speaking not wrong to do this, as the spec says that "it is expected that a server will make a best effort at supporting all the attributes", but pretty unusual. You guessed it, it's NT servers that do it.' Obtained from: Frank van der Linden <frank@wins.uva.nl>	1998-09-29 23:15:53 +00:00
mckusick	44e40659a6	Do not need (or want) to take a reference on an NFS file that is being deleted due to a forcible unmount. The problem is that vgone calls vclean() which then calls nfs_inactive() with VXLOCK set on the vnode. Nfs_inactive() was calling vget() to get a reference on the vnode, which in turn hung on VXLOCK. Nfs_inactive() now checks v_usecount to make sure that the vnode is not coming from vclean() before it does a vget().	1998-09-29 23:15:25 +00:00
mckusick	a037faba69	The code checks each fragment mark to see if it's valid; if the fragment is less than NFS_MINPACKET or greater than NFS_MAXPACKET in size, it barfs and, I think, drops the connection. However, there's no guarantee that in a multi-fragment RPC, all the fragments will be at least as large as NFS_MINPACKET. In fact, with the version of "tclnfs" we have here, which supports NFS over TCP, at least when built under SunOS 4.1.3 (i.e., with 4.1.3's user-mode ONC RPC library), I can repeatably cause "tclnfs" to send a request with more than one fragment, one of which is only 8 bytes long. I just do a 3877-byte write to a file, at an offset of 0. The check that "slp->ns_reclen" is greater than or equal to NFS_MINPACKET serves no useful purpose - if the NFS server code can't handle packets < NFS_MINPACKET bytes, it can't handle them over any protocol, so the check has to be done above the RPC-over-TCP layer - and should be removed. Obtained from: Fix from Guy Harris, forwarded by Rick Macklem.	1998-09-29 22:33:05 +00:00
mckusick	96a4dc4d6d	Mark directory buffers that have no valid data with B_INVAL so that they are not put in the cache.	1998-09-29 22:01:10 +00:00
mckusick	7774c57b8b	When adding data to a buffer, we need to clear the B_NEEDCOMMIT flag which says that the data is on server but not committed.	1998-09-29 21:46:54 +00:00
bde	e170b2ba75	Removed statically configured mount type numbers (MOUNT_*) and all references to them. The change a couple of days ago to ignore these numbers in statically configured vfsconf structs was slightly premature because the cd9660, cfs, devfs, ext2fs, nfs vfs's still used MOUNT_* instead of the number in their vfsconf struct.	1998-09-07 13:17:06 +00:00
bde	3adc4cd6e2	Made unloading of the nfs LKM sort of work. This is mainly to test detachment of vfs sysctls. Unloading of vfs LKMs doesn't actually work for any vfs, since it leaves garbage pointers to memory allocation control structures.	1998-09-07 05:42:15 +00:00
bde	0f44756d5a	Ignore the statically configured vfs type numbers and assign vfs type numbers in vfs attach order (modulo incomplete reuse of old numbers after vfs LKMs are unloaded). This requires reinitializing the sysctl tree (or at least the vfs subtree) for vfs's that support sysctls (currently only nfs). sysctl_order() already handled reinitialization reasonably except it checked for annulled self references in the wrong place. Fixed sysctls for vfs LKMs.	1998-09-05 17:13:28 +00:00
bde	a84a2dedfc	Instantiate `nfs_mount_type' in a standard file so that it is present when nfs is an LKM. Declare it in a header file. Don't forget to use it in non-Lite2 code. Initialize it to -1 instead of to 0, since 0 will soon be the mount type number for the first vfs loaded. NetBSD uses strcmp() to avoid this ugly global.	1998-09-05 15:17:34 +00:00
dfr	e2df972eb1	Cosmetic changes to the PAGE_XXX macros to make them consistent with the other objects in vm.	1998-09-04 08:06:57 +00:00
luoqi	6579ddcc26	Check for NULL pointer before freeing a struct sockaddr. m_freem() can handle NULL, buf free() can't.	1998-09-01 02:31:52 +00:00
wollman	a76fb5eefa	Yow! Completely change the way socket options are handled, eliminating another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.	1998-08-23 03:07:17 +00:00
bde	09bd4b9603	Fixed printf format errors.	1998-08-18 00:32:50 +00:00
dfr	cb85cf3e66	Protect all modifications to v_numoutput with splbio().	1998-08-13 08:09:08 +00:00
bde	92b68e1a4b	Don't configure compatibility code for pre-Lite2 mount() calls by default. This code should go away soon.	1998-08-12 20:17:42 +00:00
peter	6adfc2e8bc	If we get an ENOBUFS from the network, it's normally transient network interface congestion (eg: nfs over a ppp link, etc). Don't log these for UDP mounts, and don't cause syscalls to fail with EINTR. This stops the 'nfs send error 55' warnings. If the error is because the system is really hosed, this is the least of your problems...	1998-08-01 09:04:02 +00:00
bde	863d5c8b68	Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this changes is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.	1998-07-15 02:32:35 +00:00
dfr	0787c79732	Use u_int32_t in NQFHHASH instead of u_long.	1998-07-05 10:13:22 +00:00
julian	4363221ba2	VOP_STRATEGY grows an (struct vnode *) argument as the value in b_vp is often not really what you want. (and needs to be frobbed). more cleanups will follow this. Reviewed by: Bruce Evans <bde@freebsd.org>	1998-07-04 20:45:42 +00:00
kato	b262a6b558	Moved `#ifndef NFS_NOSERVER' after including nfs.h.	1998-07-02 12:41:42 +00:00
jmg	c8ef0cb9cd	fix buildworld hopefully be3fore anyone complains... NFS_*TIMO should possibly be converted to sysctl vars (jkh's suggestion), but in some cases it looks like nfs keeps a copy of the value in a struct hash sizes are already ifdef'd KERNEL, so there aren't userland inpact from them...	1998-06-30 11:19:22 +00:00
jmg	0e50288276	convert some nfs tunables to options, these are: NFS_MINATTRTIMO VREG attrib cache timeout in sec NFS_MAXATTRTIMO NFS_MINDIRATTRTIMO VDIR attrib cache timeout in sec NFS_MAXDIRATTRTIMO NFS_GATHERDELAY Default write gather delay (msec) NFS_UIDHASHSIZ Tune the size of nfssvc_sock with this NFS_WDELAYHASHSIZ and with this NFS_MUIDHASHSIZ Tune the size of nfsmount with this NFS_NOSERVER (already documented in LINT) NFS_DEBUG turn on NFS debugging also, because NFS_ROOT is used by very different files, it has been renamed to opt_nfsroot.h instead of the old opt_nfs.h....	1998-06-30 03:01:37 +00:00
bde	193dd07396	Fixed typo in ifdefed code. (NFS_ACDEBUG is not in LINT. Therefore, code controlled by it did not even compile.)	1998-06-21 12:50:12 +00:00
bde	a90040b583	Avoid an egcs pessimization for 64-bit signed division on i386's. Pre-2.8 versions of gcc generate a call to __divdi3() for all 64-bit signed divisions, but egcs optimizes them to a shift and fixup when the divisor is a constant power of 2. Unfortunately, it generates a call to __cmpdi2() for the fixup, although all except possibly ancient versions of gcc and egcs do ordinary 64-bit comparisons inline.	1998-06-14 15:52:00 +00:00
dfr	1d5f38ac22	This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us inline with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change. The prototype FreeBSD/alpha machdep will follow in a couple of days time.	1998-06-07 17:13:14 +00:00
peter	dec84bd443	Make sure we go a nfs_fsinfo() in get/putpages before calling readrpc/writerpc, since they assume it's already been done. This could break if the first read/write access to a nfs filesystem was an exec() or mmap() instead of a read(), write() syscall. (or statfs()). nfs_getpages() could return an errno (EOPNOTSUPP) instead of a VM_PAGER_* return code. Some layout tweaks for the get/putpages code.	1998-06-01 11:32:53 +00:00
peter	21629df4dd	Fix post-test pre-commit cleanup typo.	1998-06-01 11:07:16 +00:00
peter	4178bd0c41	readlink() returns EINVAL rather than EPERM if called on a non-symlink.	1998-06-01 10:59:23 +00:00
peter	aca365c43e	Preset the maximum file size before we get to nfs_fsinfo(), based on an (over?) conservative assumption about what the client can store in it's buffer cache using a signed 32-bit 512-byte block number index. Otherwise it's possible for some file access when maxfilesize = 0 (eg: /usr is nfs mounted and doing an execve()) Pointed out by: bde XXX It might make sense to do a preemptive nfs_fsinfo() call at mount time.	1998-06-01 10:01:31 +00:00
peter	66d2475e6f	Hide more kernel stuff from userland. This stops nethostaddr etc being wanted by mount_nfs.c.	1998-06-01 07:23:26 +00:00
peter	19ad2aa63b	For the on-the-wire protocol, u_long -> u_int32_t; long -> int32_t; int -> int32_t; u_short -> u_int16_t. Also, use mode_t instead of u_short for storing modes (mode_t is a u_int16_t). Obtained from: NetBSD	1998-05-31 20:09:01 +00:00
peter	401c250cc4	Support 'mount -u' remounts. This may require disconnecting and rebinding the socket. Certain mode changes are not allowed. Obtained from: NetBSD	1998-05-31 19:49:31 +00:00
peter	5080277e0e	Cut-n-paste glitch	1998-05-31 19:43:34 +00:00
peter	d9c0dc4a94	xdr encode -1 properly. Obtained from: NetBSD	1998-05-31 19:29:28 +00:00
peter	2a5188e78c	Fully fill in nfsv2 write rpc requests rather than leaving garbage. Obtained from: NetBSD	1998-05-31 19:28:15 +00:00
peter	da4830ce17	Don't silently fail to set file flags. Obtained from: NetBSD	1998-05-31 19:24:19 +00:00
peter	a378822d22	Don't blindly accept the server's preferences if they are too small. Obtained from: NetBSD	1998-05-31 19:20:44 +00:00
peter	21746bb862	Prototype support for selectively allowing non-reserved ports on a per export basis. Needs userland support yet. Obtained from: NetBSD	1998-05-31 19:16:08 +00:00
peter	7966818099	Hide whiteouts from NFS, since the protocol doesn't support them. Obtained from: NetBSD	1998-05-31 19:10:52 +00:00
peter	49e79dfe9e	NetBSD has a comment that Solaris 2.5 doesn't do verifiers correctly, we have weakened this test already for Digital Unix, so it may be enough for Solaris. It needs to be checked again. Obtained from: NetBSD	1998-05-31 19:07:47 +00:00
peter	a21fad22e0	Don't pass a second copy of the uid/gid in with the v2/v3 sattr structures, it just makes more work. We pass a copy of the uid/gid with the credentials. (although, this may need to be revisited if a non AUTHUNIX authentication method (such as NFSKERB) ever gets implemented). Obtained from: NetBSD	1998-05-31 19:00:19 +00:00
peter	985cae8566	Use the new SB_UPCALL flag, Obtained from: NetBSD (but I changed the flag clear order in case).	1998-05-31 18:46:06 +00:00
peter	c4805fc7a0	NFS_SMALLFH is defined in nfsproto.h, not sys/mount.h Obtained from: NetBSD	1998-05-31 18:32:23 +00:00
peter	1f34203061	Don't let the user try "rmdir ." Obtained from: NetBSD	1998-05-31 18:30:42 +00:00
peter	f61450c5b0	Don't let the user try and unlink() a directory on a NFS server. Obtained from: NetBSD	1998-05-31 18:28:45 +00:00
peter	c45767477f	When a write rpc returns an error, break the loop. Obtained from: NetBSD	1998-05-31 18:27:07 +00:00
peter	7d869157f3	Don't leak an mbuf when a write rpc returns zero bytes written. Obtained from: NetBSD	1998-05-31 18:25:32 +00:00
peter	feb54238cc	#ifdef a diagnostic printf Obtained from: NetBSD	1998-05-31 18:23:24 +00:00
peter	c50a18d361	Don't try and free mrep twice on some error conditions. Obtained from: NetBSD	1998-05-31 18:19:43 +00:00
peter	66a3e6b96c	#ifdef a diagnostic panic, plus another missed costmetic change. Obtained from: NetBSD	1998-05-31 18:11:03 +00:00
peter	32e00d316d	We have gained 2 more errno's, add them to the NFSv2 mapping table.	1998-05-31 18:09:18 +00:00
peter	87e3e1a54b	Missed a cosmetic change that the other BSD's have.	1998-05-31 18:08:09 +00:00
peter	e410a1b026	oops, nfs_msg() is called from client code too.	1998-05-31 18:06:07 +00:00
peter	35835ef239	When we can't reconnect a socket, don't forget to unlock before retrying or we can deadlock. Obtained from: NetBSD	1998-05-31 18:02:56 +00:00
peter	7f449d8699	Don't log zero length reads, this can happen during normal operation. Obtained from: NetBSD	1998-05-31 18:00:46 +00:00
peter	7246bc5193	Consider for readdir chunk sizes when tuning socket buffer reservations. Obtained from: NetBSD	1998-05-31 17:57:43 +00:00
peter	2b239be950	Refuse READDIR / READDIRPLUS rpc's for non-directories Obtained from: NetBSD	1998-05-31 17:54:18 +00:00
peter	cbeeaf83f2	Some const's Obtained from: NetBSD	1998-05-31 17:48:07 +00:00
peter	e58631da3c	NFS Jumbo commit part 1. Cosmetic and structural changes only. The aim of this part of commits is to minimize unnecessary differences between the other NFS's of similar origin. Yes, there are gratuitous changes here that the style folks won't like, but it makes the catch-up less difficult.	1998-05-31 17:27:58 +00:00
peter	c8c505e6c0	VOP_ABORTUP() appears to be called with the wrong vnode. The other callers that I checked (eg: ufs_link()) do the ABORTOP on the directory rather than the file itself. After Michael Hancock's patches, the abortop doesn't seem all that critial now since something else will free the pathname buffer.	1998-05-31 01:03:07 +00:00
peter	aa33f20993	When using NFSv3, use the remote server's idea of the maximum file size rather than assuming 2^64. It may not like files that big. :-) On the nfs server, calculate and report the max file size as the point that the block numbers in the cache would turn negative. (ie: 1099511627775 bytes (1TB)). One of the things I'm worried about however, is that directory offsets are really cookies on a NFSv3 server and can be rather large, especially when/if the server generates the opaque directory cookies by using a local filesystem offset in what comes out as the upper 32 bits of the 64 bit cookie. (a server is free to do this, it could save byte swapping depending on the native 64 bit byte order) Obtained from: NetBSD	1998-05-30 16:33:58 +00:00
peter	6d06da8101	Convert a couple of large allocations to use zones rather than malloc for better packing. This means that we can choose better values for the various hash entries without having to try and get it all to fit within an artificial power of two limit for malloc's sake.	1998-05-24 14:41:56 +00:00
peter	ef0bb32854	Only ignore "owner" permissions selectively rather than always. In some cases we ignore it (eg: read/write) to maintain chmod-after-open semantics but in other cases we do care, eg: creating files, access() etc. Never ignore errors from VOP_ACCESS() on immutable files. This apparently comes from BSDI (from Keith Bostic) via NetBSD. PR: 5148 Submitted by: Yoshiro MIHIRA <sanpei@yy.cs.keio.ac.jp>	1998-05-20 09:05:48 +00:00
peter	8da2aa7242	s/flags/flag/	1998-05-20 08:05:45 +00:00
peter	bf07d95540	A cleaner fix for PR#5102, clear nonsense flags at mount time rather than in the core of nfs_bio.c at the 11th hour. PR: 5102	1998-05-20 08:02:24 +00:00
peter	e207cd01f6	Don't change argp->flags after it's been copied.	1998-05-20 07:59:21 +00:00
peter	1777a04b11	Allow control of the attribute cache timeouts at mount time. We had run out of bits in the nfs mount flags, I have moved the internal state flags into a seperate variable. These are no longer visible via statfs(), but I don't know of anything that looks at them.	1998-05-19 07:11:27 +00:00
bde	3ee93801b9	Get timespecs directly instead of via timevals.	1998-05-16 16:20:50 +00:00
bde	ac772ab5bb	Don't abuse `+' to combine flags.	1998-05-16 16:03:10 +00:00
bde	bad475eedf	Backed out rev.1.76. It just added style bugs.	1998-05-16 15:21:29 +00:00
bde	76e69cb4f4	Get timespecs directly instead of via timevals.	1998-05-16 15:11:24 +00:00
peter	f11a8e465d	Add missing arg to vget().. Serves me right for committing a 2.2 patch to -current without testing it there.. :-( Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-05-13 07:49:08 +00:00
peter	395cfb5766	Delete the #if 0 (nearly) duplicate definitions of nfsproto.h. Having these two files that are almost-but-not-quite the same leads to false grep hits, confusion etc. Only installing one copy with a symlink would be nice but that doesn't work with SHARED=symlinks (it changes the source tree).	1998-05-13 06:40:56 +00:00
peter	6613440b58	Hold a reference to the vnode during the sillyrename cleanup. If we block in nfs_vinvalbuf() or the nfs_removeit(), we can have the nfsnode reallocated from underneath us (eg: replaced by a ufs 'struct inode') which can cause disk corruption ('freeing free block' when di_db[5] gets trashed). This is not a cheap fix, but it'll do until the nfsnodes get reference counting and/or locking. Apparently NetBSD have a similar fix (apparently from BSDI). I wish all PR's had this much useful detail. :-) PR: 6611 Submitted by: Stephen Clawson <sclawson@marker.cs.utah.edu>	1998-05-13 06:10:13 +00:00
peter	ea57a1047f	Move the *vpp initialization earlier so that it's set in all error cases. This should stop the 'panic: leaf should not be empty' nfs panic. PR: 1856 Submitted by: msaitoh@spa.is.uec.ac.jp	1998-05-13 05:47:09 +00:00
msmith	964ce778b1	In the words of the submitter: --------- Make callers of namei() responsible for releasing references or locks instead of having the underlying filesystems do it. This eliminates redundancy in all terminal filesystems and makes it possible for stacked transport layers such as umapfs or nullfs to operate correctly. Quality testing was done with testvn, and lat_fs from the lmbench suite. Some NFS client testing courtesy of Patrik Kudo. vop_mknod and vop_symlink still release the returned vpp. vop_rename still releases 4 vnode arguments before it returns. These remaining cases will be corrected in the next set of patches. --------- Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-05-07 04:58:58 +00:00
msmith	c645da3999	As described by the submitter: Reverse the VFS_VRELE patch. Reference counting of vnodes does not need to be done per-fs. I noticed this while fixing vfs layering violations. Doing reference counting in generic code is also the preference cited by John Heidemann in recent discussions with him. The implementation of alternative vnode management per-fs is still a valid requirement for some filesystems but will be revisited sometime later, most likely using a different framework. Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-05-06 05:29:41 +00:00
phk	fe94bc8288	Use random() to find our initial xid.	1998-04-06 11:41:07 +00:00
phk	9b703b1455	Eradicate the variable "time" from the kernel, using various measures. "time" wasn't a atomic variable, so splfoo() protection were needed around any access to it, unless you just wanted the seconds part. Most uses of time.tv_sec now uses the new variable time_second instead. gettime() changed to getmicrotime(0. Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it). A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random. Add a new nfs_curusec() function. Mark a couple of bogosities involving the now disappeard time variable. Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passwd &time as args. Change profiling in ncr.c to use ticks instead of time. Resolution is the same. Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences. Reviewed by: bde	1998-03-30 09:56:58 +00:00
steve	6b391e2572	Don't allow the readdirplus routine to be used in NFS V2. PR: 5102 Reviewed by: msmith Submitted by: Dmitry Kohmanyuk <dk@farm.org>	1998-03-28 16:05:05 +00:00
bde	a1015f7749	Don't depend on <sys/mount.h> including <sys/socket.h>.	1998-03-28 12:04:40 +00:00
bde	cd450d6714	Moved some #includes from <sys/param.h> nearer to where they are actually used.	1998-03-28 10:33:27 +00:00
tegge	9ac0a4296a	Add a BOOTP_WIRED_TO option, for use on machines with multiple network cards where the first detected card should not be used for bootp. Submitted by: Doug Ambrisko <ambrisko@whistle.com>	1998-03-14 04:13:56 +00:00
tegge	3483aa429f	Update workaround for limitations in the arp code. Adjust the RPC timeout message which occured when the old workaround broke to show the correct IP address.	1998-03-14 03:25:18 +00:00
julian	10c5ccc30a	Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman) Submitted by: Kirk McKusick (mcKusick@mckusick.com) Obtained from: WHistle development tree	1998-03-08 09:59:44 +00:00
dyson	8ceb6160f4	This mega-commit is meant to fix numerous interrelated problems. There has been some bitrot and incorrect assumptions in the vfs_bio code. These problems have manifest themselves worse on NFS type filesystems, but can still affect local filesystems under certain circumstances. Most of the problems have involved mmap consistancy, and as a side-effect broke the vfs.ioopt code. This code might have been committed seperately, but almost everything is interrelated. 1) Allow (pmap_object_init_pt) prefaulting of buffer-busy pages that are fully valid. 2) Rather than deactivating erroneously read initial (header) pages in kern_exec, we now free them. 3) Fix the rundown of non-VMIO buffers that are in an inconsistent (missing vp) state. 4) Fix the disassociation of pages from buffers in brelse. The previous code had rotted and was faulty in a couple of important circumstances. 5) Remove a gratuitious buffer wakeup in vfs_vmio_release. 6) Remove a crufty and currently unused cluster mechanism for VBLK files in vfs_bio_awrite. When the code is functional, I'll add back a cleaner version. 7) The page busy count wakeups assocated with the buffer cache usage were incorrectly cleaned up in a previous commit by me. Revert to the original, correct version, but with a cleaner implementation. 8) The cluster read code now tries to keep data associated with buffers more aggressively (without breaking the heuristics) when it is presumed that the read data (buffers) will be soon needed. 9) Change to filesystem lockmgr locks so that they use LK_NOPAUSE. The delay loop waiting is not useful for filesystem locks, due to the length of the time intervals. 10) Correct and clean-up spec_getpages. 11) Implement a fully functional nfs_getpages, nfs_putpages. 12) Fix nfs_write so that modifications are coherent with the NFS data on the server disk (at least as well as NFS seems to allow.) 13) Properly support MS_INVALIDATE on NFS. 14) Properly pass down MS_INVALIDATE to lower levels of the VM code from vm_map_clean. 15) Better support the notion of pages being busy but valid, so that fewer in-transit waits occur. (use p->busy more for pageouts instead of PG_BUSY.) Since the page is fully valid, it is still usable for reads. 16) It is possible (in error) for cached pages to be busy. Make the page allocation code handle that case correctly. (It should probably be a printf or panic, but I want the system to handle coding errors robustly. I'll probably add a printf.) 17) Correct the design and usage of vm_page_sleep. It didn't handle consistancy problems very well, so make the design a little less lofty. After vm_page_sleep, if it ever blocked, it is still important to relookup the page (if the object generation count changed), and verify it's status (always.) 18) In vm_pageout.c, vm_pageout_clean had rotted, so clean that up. 19) Push the page busy for writes and VM_PROT_READ into vm_pageout_flush. 20) Fix vm_pager_put_pages and it's descendents to support an int flag instead of a boolean, so that we can pass down the invalidate bit.	1998-03-07 21:37:31 +00:00
msmith	4df44c447b	Trivial filesystem getpages/putpages implementations, set the second. These should be considered the first steps in a work-in-progress. Submitted by: Terry Lambert <terry@freebsd.org>	1998-03-06 09:46:52 +00:00
msmith	950d32131b	The intent is to get rid of WILLRELE in vnode_if.src by making a complement to all ops that return a vpp, VFS_VRELE. This is initially only for file systems that implement the following ops that do a WILLRELE: vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link, vop_rename, vop_mkdir, vop_rmdir, vop_symlink This is initial DNA that doesn't do anything yet. VFS_VRELE is implemented but not called. A default vfs_vrele was created for fs implementations that use the standard vnode management routines. VFS_VRELE implementations were made for the following file systems: Standard (vfs_vrele) ffs mfs nfs msdosfs devfs ext2fs Custom union umapfs Just EOPNOTSUPP fdesc procfs kernfs portal cd9660 These implementations may change as VOP changes are implemented. In the next phase, in the vop implementations calls to vrele and the vrele part of vput will be moved to the top layer vfs_vnops and made visible to all layers. vput will be replaced by unlock in these cases. Unlocking will still be done in the per fs layer but the refcount decrement will be triggered at the top because it doesn't hurt to hold a vnode reference a little longer. This will have minimal impact on the structure of the existing code. This will only be done for vnode arguments that are released by the various fs vop implementations. Wider use of VFS_VRELE will likely require restructuring of the code. Reviewed by: phk, dyson, terry et. al. Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-03-01 22:46:53 +00:00
eivind	d7a6ab2803	Staticize.	1998-02-09 06:11:36 +00:00
eivind	4547a09753	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
dyson	fcab598523	Fix an omission of a line from the previous commit to this file. The problem appeared to be an NFS hang.	1998-02-05 16:40:57 +00:00
eivind	c552a9a1c3	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
bde	ffbb93a37a	Added #include of <sys/queue.h> so that this file is more "self"-sufficent.	1998-02-03 22:19:35 +00:00
bde	742edae5eb	Forward declare some structs so that this file is more self-sufficient.	1998-02-03 21:52:02 +00:00
bde	a1f8745634	Moved declaration of `union nethostadr' outside of the KERNEL section, to give pollution compatible with <nfs/nqfs.h>. At least mount_nfs.c previously had to #define KERNEL before including <nfs/nfs.h> to get this pollution, but this gave other pollution. Moved comment about NFSINT_SIGMASK to immediately before the code that it applies to.	1998-02-01 21:23:29 +00:00
bde	446b81e75d	Forward declare more structs that are used in prototypes here - don't depend on <sys/types.h> forward declaring common ones. Added an underscore to `sin' in prototypes to avoid warnings for the conflict with the ANSI sin().	1998-02-01 20:34:07 +00:00
dyson	2aacd1ab4f	Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtile "holes." Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistant scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.	1998-01-31 11:56:53 +00:00
tegge	72e132a129	Release the buffer when an error occurs while reading directory entries.	1998-01-31 01:27:18 +00:00
dyson	548a436486	Various NFS fixes: Make vfs_bio buffer mgmt work better. Buffers were being used after brelse. Make nfs_getpages work independently of other NFS interfaces. This eliminates some difficult recursion problems and decreases pagefault overhead. Remove an erroneous vfs_unbusy_pages. Fix a reentrancy problem, with nfs_vinvalbuf when vnode is already being rundown. Reassignbuf wasn't being called when needed under certain circumstances. (Thanks to Bill Paul for help.)	1998-01-25 06:24:09 +00:00
dyson	4aef3b4b7a	Various NFS fixes: Make vfs_bio buffer mgmt work better. Buffers were being used after brelse. Make nfs_getpages work independently of other NFS interfaces. This eliminates some difficult recursion problems and decreases pagefault overhead. Remove an erroneous vfs_unbusy_pages. Fix a reentrancy problem, with nfs_vinvalbuf when vnode is already being rundown. Reassignbuf wasn't being called when needed under certain circumstances. (Thanks for help from Bill Paul.)	1998-01-25 06:14:26 +00:00
tegge	5bd44675cc	Increase the minimum bootp reply packet size from 16 (bogus) to 300 (correct).	1998-01-18 18:46:20 +00:00
eivind	57d4125c71	Make the BOOTP family new-style options (in opt_bootp.h)	1998-01-09 03:21:07 +00:00
eivind	bcae2312af	Make INET a proper option. This will not make any of object files that LINT create change; there might be differences with INET disabled, but hardly anything compiled before without INET anyway. Now the 'obvious' things will give a proper error if compiled without inet - ipx_ip, ipfw, tcp_debug. The only thing that _should_ work (but can't be made to compile reasonably easily) is sppp :-( This commit move struct arpcom from <netinet/if_ether.h> to <net/if_arp.h>.	1998-01-08 23:42:31 +00:00
dyson	cb2800cd94	Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitiously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does. When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex. When vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore. A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes. Please be careful with your system while running this code, but I would greatly appreciate feedback as soon a reasonably possible.	1998-01-06 05:26:17 +00:00
dyson	cd67bb82fe	Lots of improvements, including restructring the caching and management of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include: 1) Cleaning up vref, vget. 2) Removal of the object cache. 3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore. 4) Correct some missing LK_RETRY's in vn_lock. 5) Correct the page range in the code for msync. Be gentle, and please give me feedback asap.	1997-12-29 00:25:11 +00:00
bde	3c1b6940fc	Unspammed nested include of <vm/vm_zone.h>.	1997-12-27 02:56:39 +00:00
bde	52dbbb4e05	Added a used include. Fixed a gratuitous ANSIism and nearby KNF violations.	1997-12-20 00:25:01 +00:00
dyson	2484645f8d	Various of the ISP users have commented that the 1.41 version of the nfs_bio.c code worked better than the 1.44. This commit reverts the important parts of 1.44 to 1.41, and we will fix it when we can get a handle on the problem.	1997-12-08 00:59:08 +00:00
bde	a323d6696a	Don't call malloc(..., M_WAITOK) at splnet(). Doing so is often a mistake (since softnet interrupts may occur if malloc() waits), and doing it harmlessly but unnecessarily here interfered with detection of the mistaken cases.	1997-11-24 14:18:00 +00:00
julian	ae22df605c	Reviewed by: various. Ever since I first say the way the mount flags were used I've hated the fact that modes, and events, internal and exported, and short-term and long term flags are all thrown together. Finally it's annoyed me enough.. This patch to the entire FreeBSD tree adds a second mount flag word to the mount struct. it is not exported to userspace. I have moved some of the non exported flags over to this word. this means that we now have 8 free bits in the mount flags. There are another two that might well move over, but which I'm not sure about. The only user visible change would have been in pstat -v, except that davidg has disabled it anyhow. I'd still like to move the state flags and the 'command' flags apart from each other.. e.g. MNT_FORCE really doesn't have the same semantics as MNT_RDONLY, but that's left for another day.	1997-11-12 05:42:33 +00:00
phk	ccc7e7fa9f	Rename some local variables to avoid shadowing other local variables. Found by: -Wshadow	1997-11-07 09:21:01 +00:00
phk	4d26888936	Remove a bunch of variables which were unused both in GENERIC and LINT. Found by: -Wunused	1997-11-07 08:53:44 +00:00
phk	4c8218a5c7	Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warning, and removes a lot of cruft from the sources. I have not removed the /ARGSUSED/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need recompiled.	1997-11-06 19:29:57 +00:00
bde	fb826377ff	Removed unused #includes.	1997-10-28 15:59:26 +00:00
bde	974f2bea15	Don't #include <nfs/nfs.h> in <nfs/nfs_node.h> if KERNEL is defined. Fixed everything that depended on the nested include.	1997-10-28 14:06:25 +00:00
bde	e38eb2dd99	Removed unused #includes. The need for most of them went away with recent changes (docluster* and vfs improvements).	1997-10-27 13:33:47 +00:00
phk	e3cdaf12b2	VFS interior redecoration. Rename vn_default_error to vop_defaultop all over the place. Move vn_bwrite from vfs_bio.c to vfs_default.c and call it vop_stdbwrite. Use vop_null instead of nullop. Move vop_nopoll from vfs_subr.c to vfs_default.c Move vop_sharedlock from vfs_subr.c to vfs_default.c Move vop_nolock from vfs_subr.c to vfs_default.c Move vop_nounlock from vfs_subr.c to vfs_default.c Move vop_noislocked from vfs_subr.c to vfs_default.c Use vop_ebadf instead of *_ebadf. Add vop_defaultop for getpages on master vnode in MFS.	1997-10-26 20:55:39 +00:00
phk	190f3b183f	Always initialize the syscall vectors for our "private" syscalls (not just in the LKM case). Plug nqnfs_vop_lease_check directly into the default_vnodeop_p table.	1997-10-26 20:13:52 +00:00
phk	f82436f706	VFS clean up "hekto commit" 1. Add defaults for more VOPs VOP_LOCK vop_nolock VOP_ISLOCKED vop_noislocked VOP_UNLOCK vop_nounlock and remove direct reference in filesystems. 2. Rename the nfsv2 vnop tables to improve sorting order.	1997-10-16 22:01:05 +00:00
phk	373a865574	Another VFS cleanup "kilo commit" 1. Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS} intereface function, and now lives in the ufsmount structure. 2. Remove VOP_SEEK, it was unused. 3. Add mode default vops: VOP_ADVLOCK vop_einval VOP_CLOSE vop_null VOP_FSYNC vop_null VOP_IOCTL vop_enotty VOP_MMAP vop_einval VOP_OPEN vop_null VOP_PATHCONF vop_einval VOP_READLINK vop_einval VOP_REALLOCBLKS vop_eopnotsupp And remove identical functionality from filesystems 4. Add vop_stdpathconf, which returns the canonical stuff. Use it in the filesystems. (XXX: It's probably wrong that specfs and fifofs sets this vop, shouldn't it come from the "host" filesystem, for instance ufs or cd9660 ?) 5. Try to make system wide VOP functions have vop_* names. 6. Initialize the um_* vectors in LFS. (Recompile your LKMS!!!)	1997-10-16 20:32:40 +00:00
phk	d166441755	VFS mega cleanup commit (x/N) 1. Add new file "sys/kern/vfs_default.c" where default actions for VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE, POLL, REVOKE and STRATEGY. Various stuff spread over the entire tree belongs here. 2. Change VOP_BLKATOFF to a normal function in cd9660. 3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These are private interface functions between UFS and the underlying storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now live in struct ufsmount instead. 4. Remove a kludge of VOP_ functions in all filesystems, that did nothing but obscure the simplicity and break the expandability. If a filesystem doesn't implement VOP_FOO, it shouldn't have an entry for it in its vnops table. The system will try to DTRT if it is not implemented. There are still some cruft left, but the bulk of it is done. 5. Fix another VCALL in vfs_cache.c (thanks Bruce!)	1997-10-16 10:50:27 +00:00
phk	f7aabc3ac9	vnops megacommit 1. Use the default function to access all the specfs operations. 2. Use the default function to access all the fifofs operations. 3. Use the default function to access all the ufs operations. 4. Fix VCALL usage in vfs_cache.c 5. Use VOCALL to access specfs functions in devfs_vnops.c 6. Staticize most of the spec and fifofs vnops functions. 7. Make UFS panic if it lacks bits of the underlying storage handling.	1997-10-15 13:24:07 +00:00
phk	92eeb70dc6	Hmm, realign the vnops into two columns.	1997-10-15 10:05:29 +00:00
phk	26130e0b77	Stylistic overhaul of vnops tables. 1. Remove comment stating the blatantly obvious. 2. Align in two columns. 3. Sort all but the default element alphabetically. 4. Remove XXX comments pointing out entries not needed.	1997-10-15 09:22:02 +00:00

540 Commits