freebsd-nq

Author	SHA1	Message	Date
Matthew Dillon	ea94c7b968	Synopsis of problem being fixed: Dan Nelson originally reported that blocks of zeros could wind up in a file written to over NFS by a client. The problem only occurs a few times per several gigabytes of data. This problem turned out to be bug #3 below. bug #1: B_CLUSTEROK must be cleared when an NFS buffer is reverted from stage 2 (ready for commit rpc) to stage 1 (ready for write). Reversions can occur when a dirty NFS buffer is redirtied with new data. Otherwise the VFS/BIO system may end up thinking that a stage 1 NFS buffer is clusterable. Stage 1 NFS buffers are not clusterable. bug #2: B_CLUSTEROK was inappropriately set for a 'short' NFS buffer (short buffers only occur near the EOF of the file). Change to only set when the buffer is a full biosize (usually 8K). This bug has no effect but should be fixed in -current anyway. It need not be backported. bug #3: B_NEEDCOMMIT was inappropriately set in nfs_flush() (which is typically only called by the update daemon). nfs_flush() does a multi-pass loop but due to the lack of vnode locking it is possible for new buffers to be added to the dirtyblkhd list while a flush operation is going on. This may result in nfs_flush() setting B_NEEDCOMMIT on a buffer which has NOT yet gone through its stage 1 write, causing only the commit rpc to be made and thus causing the contents of the buffer to be thrown away (never sent to the server). The patch also contains some cleanup, which only applies to the commit into -current. Reviewed by: dg, julian Originally Reported by: Dan Nelson <dnelson@emsphone.com>	1999-12-12 06:09:57 +00:00
Matthew Dillon	b314ed9662	nm_srtt and nm_sdrtt are arrays[4]. Remove explicit initialization of element [4] in both, which goes beyond the end of the array, leaving [0], [1], [2], and [3]. This bug did not cause any problems since the overrun fields are initialized after the bogus array init but needs to be fixed anyway. Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	1999-11-22 04:50:09 +00:00
Eivind Eklund	dd8c04f4c7	Remove WILLRELE from VOP_SYMLINK Note: Previous commit to these files (except coda_vnops and devfs_vnops) that claimed to remove WILLRELE from VOP_RENAME actually removed it from VOP_MKNOD.	1999-11-13 20:58:17 +00:00
Eivind Eklund	edfe736df9	Remove WILLRELE from VOP_RENAME	1999-11-12 03:34:28 +00:00
Matthew Dillon	a6aa6d9137	Remove special case socket sharing code in order to allow nfsd to bind IP addresses to udp/cltp sockets separately. PR: kern/13049 Reviewed by: David Malone <dwmalone@maths.tcd.ie>, freebsd-current	1999-11-11 17:24:02 +00:00
Matthew Dillon	6b21e94604	Fix nfssvc_addsock() to not attempt to free a NULL socket structure when returning an error. Bug fix was extracted from the PR. The PR is not yet entirely resolved by this commit. PR: kern/13049 Reviewed by: Matt Dillon <dillon@freebsd.org> Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	1999-11-08 19:10:16 +00:00
Matthew Dillon	a5d3fe3f85	Move NFS access cache hits/misses into nfsstats structure so /usr/bin/nfsstat can get to it easily.	1999-10-25 19:22:33 +00:00
Poul-Henning Kamp	3b6fb88590	Before we start to mess with the VFS name-cache clean things up a little bit: Isolate the namecache in its own file, and give it a dedicated malloc type.	1999-10-03 12:18:29 +00:00
Marcel Moolenaar	16df98ecc6	Careless use of struct proc *p caused major problems. 'p' is allowed to be NULL in this function (nfs_sigintr). Reorder the statements and guard them all with a single if (p != NULL). reported, reviewed and tested by: jdp	1999-09-29 20:12:39 +00:00
Matthew Dillon	13e14363fe	Make FreeBSD less conservative in determining when to return a cookie error for a directory. I have made this change after a great deal of review although I cannot be absolutely sure that this meets the spec. The issue devolves into whether changes in an underlying (UFS) directory can cause NFS directory blocks to be renumbered. My read of the code indicates that NFS directory blocks will not be renumbered, which means that the cookies should still remain valid after a change is made to the underlying directory. This being the case, a cookie error should not be returned when a change is made to the underlying directory and, instead, the NFS client should rely on mtime detection to invalidate and reload the directory. The use of mtime is problematic in and of itself, due to insufficient resolution, which is why I believe the original conservative error handling was done. Still, there have been dozens of bug reports by people needing Solaris<->FreeBSD interoperability and these have to be accommodated.	1999-09-29 17:14:58 +00:00
Marcel Moolenaar	2c42a14602	sigset_t change (part 2 of 5): The core of the signalling code has been rewritten to operate on the new sigset_t. No methodological changes have been made. Most references to a sigset_t object are through macros (see signalvar.h) to create a level of abstraction and to provide a basis for further improvements. The NSIG constant has not been changed to reflect the maximum number of signals possible. The reason is that it breaks programs (especially shells) which assume that all signals have a non-null name in sys_signame. See src/bin/sh/trap.c for an example. Instead _SIG_MAXSIG has been introduced to hold the maximum signal possible with the new sigset_t. struct sigprop has been moved from signalvar.h to kern_sig.c because a) it is only used there, and b) access must be done through function sigprop(). The latter because the table doesn't hold properties for all signals, but only for the first NSIG signals. signal.h has been reorganized to make reading easier and to add the new and/or modified structures. The "old" structures are moved to signalvar.h to prevent namespace pollution. Especially the coda filesystem suffers from the change, because it contained lines like (p->p_sigmask == SIGIO), which is easy to do for integral types, but not for compound types. NOTE: kdump (and port linux_kdump) must be recompiled. Thanks to Garrett Wollman and Daniel Eischen for pressing the importance of changing sigreturn as well.	1999-09-29 15:03:48 +00:00
Matthew Dillon	b5acbc8b9c	Asynchronized client-side nfs_commit. NFS commit operations were previously issued synchronously even if async daemons (nfsiod's) were available. The commit has been moved from the strategy code to the doio code in order to asynchronize it. Removed use of lastr in preparation for removal of vnode->v_lastr. It has been replaced with seqcount, which is already supported by the system and, in fact, gives us a better heuristic for sequential detection than lastr ever did. Made major performance improvements to the server side commit. The server previously fsync'd the entire file for each commit rpc. The server now bawrite()s only those buffers related to the offset/size specified in the commit rpc. Note that we do not commit the meta-data yet. This work still needs to be done. Note that a further optimization can be done (and has not yet been done) on the client: we can merge multiple potential commit rpc's into a single rpc with a greater file offset/size range and greatly reduce rpc traffic. Reviewed by: Alan Cox <alc@cs.rice.edu>, David Greenman <dg@root.com>	1999-09-17 05:57:57 +00:00
Alfred Perlstein	c24fda81c9	Separate the export check in VFS_FHTOVP; exports are now checked via VFS_CHECKEXP. Add fh(open|stat|statfs) syscalls to allow userland to query filesystems based on (network) filehandle. Obtained from: NetBSD	1999-09-11 00:46:08 +00:00
Poul-Henning Kamp	9626728875	remove unused variables.	1999-08-28 19:21:03 +00:00
Peter Wemm	c3aac50f28	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
Poul-Henning Kamp	dbafb3660f	Simplify the handling of VCHR and VBLK vnodes using the new dev_t: Make the alias list a SLIST. Drop the "fast recycling" optimization of vnodes (including the returning of a preexisting but stale vnode from checkalias). It doesn't buy us anything now that we don't hardlimit vnodes anymore. Rename checkalias2() and checkalias() to addalias() and addaliasu() - which takes dev_t and udev_t arg respectively. Make the revoke syscalls use vcount() instead of VALIASED. Remove VALIASED flag, we don't need it now and it is faster to traverse the much shorter lists than to maintain the flag. vfs_mountedon() can check the dev_t directly, all the vnodes point to the same one. Print the devicename in specfs/vprint(). Remove a couple of stale LFS vnode flags. Remove unimplemented/unused LK_DRAINED.	1999-08-26 14:53:31 +00:00
Peter Wemm	ac7cc2e469	Convert all the nfs macros to do { blah } while (0) to ensure it works correctly in if/else etc. egcs had probably picked up most of the problems here before with "ambiguous braces" etc, but this should increase the robustness a bit. Based on an idea from Eivind Eklund.	1999-08-19 14:50:12 +00:00
Poul-Henning Kamp	0ef1c82630	Decommission miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>. Add a few fields to struct specinfo, paving the way for the fun part.	1999-08-08 18:43:05 +00:00
Peter Wemm	56ba093ddb	Don't over-allocate and over-copy shorter NFSv2 filehandles and then correct the pointers afterwards. It's kinda bogus that we generate a 24 (?) byte filehandle (2 x int32 fsid and 16 byte VFS fhandle) and pad it out to 64 bytes for NFSv3 with garbage. The whole point of NFSv3's variable filehandle length was to allow for shorter handles, both in memory and over the wire. I plan on taking a shot at fixing this shortly.	1999-08-04 14:41:39 +00:00
Bill Paul	9c9743b67b	Correct the sanity test length calculation in nfsrv_readdirplus(): len is being incremented by 4 bytes too few each time through the loop, which allows more data into the mbuf chain than we really want. In the worst case, when we're using 32K read/write sizes with a TCP client, this causes readdirplus replies to sometimes exceed NFS_MAXPACKET which leads to a panic. This problem cropped up for me using an IRIX 6.5.4 NFSv3 TCP client with 32K read/write sizes, however supposedly it can be triggered by WinNT NFS servers too. In theory, it can probably be triggered by any NFS v3 implementation using TCP as long as it's using the maximum block size. Reviewed by: Matthew Dillon <dillon@backplane.com>	1999-07-29 21:42:57 +00:00
Alan Cox	3b5f11efe6	Clear error in nfsrv_create when we have a valid reply so that that reply is actually transmitted. Submitted by: dillon	1999-07-28 08:20:49 +00:00
Poul-Henning Kamp	f008cfcc1a	I have not one single time remembered the name of this function correctly so obviously I gave it the wrong name. s/umakedev/makeudev/g	1999-07-17 18:43:50 +00:00
Julian Elischer	3ba6a72322	Submitted by: "David E. Cross" <crossd@cs.rpi.edu> Matt missed a line..	1999-06-30 04:29:13 +00:00
Peter Wemm	e96c1fdc3f	Minor tweaks to make sure (new) prerequisites for <sys/buf.h> (mostly splbio()/splx()) are #included in time.	1999-06-27 11:44:22 +00:00
Kirk McKusick	67812eacd7	Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.	1999-06-26 02:47:16 +00:00
Julian Elischer	3d84d191cc	Matt's NFS fixes. Submitted by: Matt Dillon Reviewed by: David Cross, Julian Elischer, Mike Smith, Drew Gallatin 3.2 version to follow when tested	1999-06-23 04:44:14 +00:00
Peter Wemm	b903b04cc0	Various changes lifted from the OpenBSD cvs tree: txdr_hyper and fxdr_hyper tweaks to avoid excessive CPU order knowledge. nfs_serv.c: don't call nfsm_adj() with negative values, windows clients could crash servers when doing a readdir of a large directory. nfs_socket.c: Use IP_PORTRANGE to get a privileged port without a spin loop trying to bind(). Don't clobber a mbuf pointer or we get panics on a NFS3ERR_JUKEBOX error from a server when reusing a freed mbuf. nfs_subs.c: Don't lose st_blocks on NFSv2 mounts when > 2GB. Obtained from: OpenBSD	1999-06-05 05:35:03 +00:00
Poul-Henning Kamp	bfbb9ce670	Divorce "dev_t" from the "major|minor" bitmap, which is now called udev_t in the kernel but still called dev_t in userland. Provide functions to manipulate both types: major() umajor() minor() uminor() makedev() umakedev() dev2udev() udev2dev() For now they're functions, they will become in-line functions after one of the next two steps in this process. Return major/minor/makedev to macro-hood for userland. Register a name in cdevsw[] for the "filedescriptor" driver. In the kernel the udev_t appears in places where we have the major/minor number combination, (ie: a potential device: we may not have the driver nor the device), like in inodes, vattr, cdevsw registration and so on, whereas the dev_t appears where we carry around a reference to an actual device. In the future the cdevsw and the aliased-from vnode will be hung directly from the dev_t, along with up to two softc pointers for the device driver and a few housekeeping bits. This will essentially replace the current "alias" check code (same buck, bigger bang). A little stunt has been provided to try to catch places where the wrong type is being used (dev_t vs udev_t), if you see something not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if it makes a difference. If it does, please try to track it down (many hands make light work) or at least try to reproduce it as simply as possible, and describe how to do that. Without DEVT_FASCIST I believe this patch is a no-op. Stylistic/posixoid comments about the userland view of the <sys/*.h> files welcome now, from userland they now contain the end result. Next planned step: make all dev_t's refer to the same devsw[] which means convert BLK's to CHR's at the perimeter of the vnodes and other places where they enter the game (bootdev, mknod, sysctl).	1999-05-11 19:55:07 +00:00
Poul-Henning Kamp	b0eeea2042	remove b_proc from struct buf, it's (now) unused. Reviewed by: dillon, bde	1999-05-06 20:00:34 +00:00
Peter Wemm	dfd5dee1b0	Add sufficient braces to keep egcs happy about potentially ambiguous if/else nesting.	1999-05-06 18:13:11 +00:00
Alan Cox	4221e284a3	The VFS/BIO subsystem contained a number of hacks in order to optimize piecemeal, middle-of-file writes for NFS. These hacks have caused no end of trouble, especially when combined with mmap(). I've removed them. Instead, NFS will issue a read-before-write to fully instantiate the struct buf containing the write. NFS does, however, optimize piecemeal appends to files. For most common file operations, you will not notice the difference. The sole remaining fragment in the VFS/BIO system is b_dirtyoff/end, which NFS uses to avoid cache coherency issues with read-merge-write style operations. NFS also optimizes the write-covers-entire-buffer case by avoiding the read-before-write. There is quite a bit of room for further optimization in these areas. The VM system marks pages fully-valid (AKA vm_page_t->valid = VM_PAGE_BITS_ALL) in several places, most notably in vm_fault. This is not correct operation. The vm_pager_get_pages() code is now responsible for marking VM pages all-valid. A number of VM helper routines have been added to aid in zeroing-out the invalid portions of a VM page prior to the page being marked all-valid. This operation is necessary to properly support mmap(). The zeroing occurs most often when dealing with file-EOF situations. Several bugs have been fixed in the NFS subsystem, including bits handling file and directory EOF situations and buf->b_flags consistency issues relating to clearing B_ERROR & B_INVAL, and handling B_DONE. getblk() and allocbuf() have been rewritten. B_CACHE operation is now formally defined in comments and more straightforward in implementation. B_CACHE for VMIO buffers is based on the validity of the backing store. B_CACHE for non-VMIO buffers is based simply on whether the buffer is B_INVAL or not (B_CACHE set if B_INVAL clear, and vice versa). biodone() is now responsible for setting B_CACHE when a successful read completes. B_CACHE is also set when a bdwrite() is initiated and when a bwrite() is initiated. VFS VOP_BWRITE routines (there are only two - nfs_bwrite() and bwrite()) are now expected to set B_CACHE. This means that bowrite() and bawrite() also set B_CACHE indirectly. There are a number of places in the code which were previously using buf->b_bufsize (which is DEV_BSIZE aligned) when they should have been using buf->b_bcount. These have been fixed. getblk() now clears B_DONE on return because the rest of the system is so bad about dealing with B_DONE. Major fixes to NFS/TCP have been made. A server-side bug could cause requests to be lost by the server due to nfs_realign() overwriting other rpc's in the same TCP mbuf chain. The server's kernel must be recompiled to get the benefit of the fixes. Submitted by: Matthew Dillon <dillon@apollo.backplane.com>	1999-05-02 23:57:16 +00:00
Poul-Henning Kamp	75c1354190	This Implements the mumbled about "Jail" feature. This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do. For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers". Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname. Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors. It generally does what one would expect, but setting up a jail still takes a little knowledge. A few notes: I have no scripts for setting up a jail, don't ask me for them. The IP number should be an alias on one of the interfaces. mount a /proc in each jail, it will make ps more useable. /proc/<pid>/status tells the hostname of the prison for jailed processes. Quotas are only sensible if you have a mountpoint per prison. There are no provisions for stopping resource-hogging. Some "#ifdef INET" and similar may be missing (send patches!) If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome! Tools, comments, patches & documentation most welcome. Have fun... Sponsored by: http://www.rndassociates.com/ Run for almost a year by: http://www.servetheweb.com/	1999-04-28 11:38:52 +00:00
Poul-Henning Kamp	f711d546d2	Suser() simplification: 1: s/suser/suser_xxx/ 2: Add new function: suser(struct proc *), prototyped in <sys/proc.h>. 3: s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/ The remaining suser_xxx() calls will be scrutinized and dealt with later. There may be some unneeded #include <sys/cred.h>, but they are left as an exercise for Bruce. More changes to the suser() API will come along with the "jail" code.	1999-04-27 11:18:52 +00:00
Dmitrij Tejblum	c1eefce941	Fixed printf format errors on alpha.	1999-04-24 11:29:48 +00:00
Peter Wemm	803870b48d	Untangle the nfs send and receive queue locking a little. One lock routine was [ab]used for two different things, and you couldn't tell from the wait channel which one had wedged. Catch a few things missing from NFS_NOSERVER.	1999-02-25 00:03:51 +00:00
Doug Rabson	ef5253d801	Move the declaration of the vfs.nfs sysctl node outside an ifdef so that it builds if NFS_NOSERVER is defined. Spotted by: Bruce Evans <bde@zeta.org.au>	1999-02-18 09:19:41 +00:00
Bruce Evans	1f2e401efc	Fixed bitrot in NFS_ACDEBUG option.	1999-02-17 13:59:29 +00:00
Doug Rabson	ce02431ffa	* Change sysctl from using linker_set to construct its tree using SLISTs. This makes it possible to change the sysctl tree at runtime. * Change KLD to find and register any sysctl nodes contained in the loaded file and to unregister them when the file is unloaded. Reviewed by: Archie Cobbs <archie@whistle.com>, Peter Wemm <peter@netplex.com.au> (well they looked at it anyway)	1999-02-16 10:49:55 +00:00
Matthew Dillon	831a80b0d5	Fix warnings in preparation for adding -Wall -Wcast-qual to the kernel compile	1999-01-27 22:42:27 +00:00
Matthew Dillon	1c7c3c6a86	This is a rather large commit that encompasses the new swapper, changes to the VM system to support the new swapper, VM bug fixes, several VM optimizations, and some additional revamping of the VM code. The specific bug fixes will be documented with additional forced commits. This commit is somewhat rough in regards to code cleanup issues. Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>	1999-01-21 08:29:12 +00:00
Eivind Eklund	fb1167777a	Remove the 'waslocked' parameter to vfs_object_create().	1999-01-05 18:50:03 +00:00
Tim Vanderhoek	dea9268b70	Silence -Wtrigraph. Submitted by: Bradley Dunn <bradley@dunn.org> (pr: kern/8817)	1998-12-30 00:37:44 +00:00
Doug Rabson	6cd60632a6	Fix for creating files on a Solaris 7 server with NFSv3 (the request was slightly garbled but older servers seemed to understand it). Reviewed by: David O'Brien <obrien@nuxi.ucdavis.edu>	1998-12-25 10:34:27 +00:00
Dmitrij Tejblum	85f118c801	Added 3 new errno values, required by various standards: EOVERFLOW, ECANCELED, EILSEQ. Fixed ibcs2 and especially linux EIDRM and ENOMSG errno mapping. Reviewed by: Dan Nelson <dnelson@emsphone.com>	1998-12-14 18:54:04 +00:00
Eivind Eklund	5fd7941bd3	Remove the if fixed in the last commit; bde quite correctly pointed out that it can never fail.	1998-12-09 15:12:53 +00:00
Eivind Eklund	d27dddc9d5	Fix typo (; in "if (vp == NULL);").	1998-12-08 23:11:24 +00:00
Archie Cobbs	f1d19042b0	The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static and local variables, goto labels, and functions declared but not defined.	1998-12-07 21:58:50 +00:00
Doug Rabson	86442b5201	Fix a panic in nfsrv_dorec() where a NULL pointer could be passed to free() sometimes. Reviewed by: Eric Haug <ejh@eas.slu.edu>	1998-11-13 09:44:12 +00:00
Peter Wemm	dad00f4e9c	Remove [apparently] bogus casts to u_long for the vnode_pager_setsize() second argument. np_size is a 64 bit int, so is the second arg. This might have caused needless 2G/4G file size problems. I believe it was Bruce who queried this.	1998-11-09 07:00:14 +00:00
Peter Wemm	1f2edded90	vm_object_page_clean() last arg changed from TRUE to OBJPC_SYNC. I'm not sure that this is necessary to be a sync write here since a VOP_FSYNC() follows and it will schedule, sort and complete the writes that the vm_object_page_clean() started (as I think I understand things).	1998-10-31 15:39:31 +00:00
Peter Wemm	40c8cfe552	Use TAILQ macros for clean/dirty block list processing. Set b_xflags rather than abusing the list next pointer with a magic number.	1998-10-31 15:31:29 +00:00
Kirk McKusick	96438eb911	The code checks each fragment mark to see if it's valid; if the fragment is less than NFS_MINPACKET or greater than NFS_MAXPACKET in size, it barfs and, I think, drops the connection. However, there's no guarantee that in a multi-fragment RPC, all the fragments will be at least as large as NFS_MINPACKET. In fact, with the version of "tclnfs" we have here, which supports NFS over TCP, at least when built under SunOS 4.1.3 (i.e., with 4.1.3's user-mode ONC RPC library), I can repeatably cause "tclnfs" to send a request with more than one fragment, one of which is only 8 bytes long. I just do a 3877-byte write to a file, at an offset of 0. The check that "slp->ns_reclen" is greater than or equal to NFS_MINPACKET serves no useful purpose - if the NFS server code can't handle packets < NFS_MINPACKET bytes, it can't handle them over any protocol, so the check has to be done above the RPC-over-TCP layer - and should be removed. Obtained from: Fix from Guy Harris, forwarded by Rick Macklem.	1998-09-29 22:33:05 +00:00
Bruce Evans	cae300be0f	Made unloading of the nfs LKM sort of work. This is mainly to test detachment of vfs sysctls. Unloading of vfs LKMs doesn't actually work for any vfs, since it leaves garbage pointers to memory allocation control structures.	1998-09-07 05:42:15 +00:00
Bruce Evans	500b04a257	Instantiate `nfs_mount_type' in a standard file so that it is present when nfs is an LKM. Declare it in a header file. Don't forget to use it in non-Lite2 code. Initialize it to -1 instead of to 0, since 0 will soon be the mount type number for the first vfs loaded. NetBSD uses strcmp() to avoid this ugly global.	1998-09-05 15:17:34 +00:00
Luoqi Chen	4ef872a4c5	Check for NULL pointer before freeing a struct sockaddr. m_freem() can handle NULL, but free() can't.	1998-09-01 02:31:52 +00:00
Garrett Wollman	cfe8b629f1	Yow! Completely change the way socket options are handled, eliminating another specialized mbuf type in the process. Also clean up some of the cruft surrounding IPFW, multicast routing, RSVP, and other ill-explored corners.	1998-08-23 03:07:17 +00:00
Peter Wemm	c5fa8d1a2c	If we get an ENOBUFS from the network, it's normally transient network interface congestion (eg: nfs over a ppp link, etc). Don't log these for UDP mounts, and don't cause syscalls to fail with EINTR. This stops the 'nfs send error 55' warnings. If the error is because the system is really hosed, this is the least of your problems...	1998-08-01 09:04:02 +00:00
Bruce Evans	a23d65bfc8	Cast pointers to uintptr_t/intptr_t instead of to u_long/long, respectively. Most of the longs should probably have been u_longs, but this change is just to prevent warnings about casts between pointers and integers of different sizes, not to fix poorly chosen types.	1998-07-15 02:32:35 +00:00
KATO Takenori	936f266f99	Moved `#ifndef NFS_NOSERVER' after including nfs.h.	1998-07-02 12:41:42 +00:00
John-Mark Gurney	56786ee91b	Fix buildworld, hopefully before anyone complains... NFS_*TIMO should possibly be converted to sysctl vars (jkh's suggestion), but in some cases it looks like nfs keeps a copy of the value in a struct. Hash sizes are already ifdef'd KERNEL, so there is no userland impact from them...	1998-06-30 11:19:22 +00:00
John-Mark Gurney	df394affa2	Convert some nfs tunables to options. These are: NFS_MINATTRTIMO and NFS_MAXATTRTIMO (VREG attrib cache timeout in sec); NFS_MINDIRATTRTIMO and NFS_MAXDIRATTRTIMO (VDIR attrib cache timeout in sec); NFS_GATHERDELAY (default write gather delay, msec); NFS_UIDHASHSIZ and NFS_WDELAYHASHSIZ (tune the size of nfssvc_sock with these); NFS_MUIDHASHSIZ (tune the size of nfsmount with this); NFS_NOSERVER (already documented in LINT); NFS_DEBUG (turn on NFS debugging). Also, because NFS_ROOT is used by very different files, it has been renamed to opt_nfsroot.h instead of the old opt_nfs.h....	1998-06-30 03:01:37 +00:00
Bruce Evans	29c0cb37eb	Fixed typo in ifdefed code. (NFS_ACDEBUG is not in LINT. Therefore, code controlled by it did not even compile.)	1998-06-21 12:50:12 +00:00
Bruce Evans	4c4918c9e4	Avoid an egcs pessimization for 64-bit signed division on i386's. Pre-2.8 versions of gcc generate a call to __divdi3() for all 64-bit signed divisions, but egcs optimizes them to a shift and fixup when the divisor is a constant power of 2. Unfortunately, it generates a call to __cmpdi2() for the fixup, although all except possibly ancient versions of gcc and egcs do ordinary 64-bit comparisons inline.	1998-06-14 15:52:00 +00:00
Doug Rabson	ecbb00a262	This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us in line with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change. The prototype FreeBSD/alpha machdep will follow in a couple of days time.	1998-06-07 17:13:14 +00:00
Peter Wemm	4152886f7a	For the on-the-wire protocol, u_long -> u_int32_t; long -> int32_t; int -> int32_t; u_short -> u_int16_t. Also, use mode_t instead of u_short for storing modes (mode_t is a u_int16_t). Obtained from: NetBSD	1998-05-31 20:09:01 +00:00
Peter Wemm	75c6892c16	Support 'mount -u' remounts. This may require disconnecting and rebinding the socket. Certain mode changes are not allowed. Obtained from: NetBSD	1998-05-31 19:49:31 +00:00
Peter Wemm	261114d95c	Cut-n-paste glitch	1998-05-31 19:43:34 +00:00
Peter Wemm	71c667c91b	Prototype support for selectively allowing non-reserved ports on a per export basis. Needs userland support yet. Obtained from: NetBSD	1998-05-31 19:16:08 +00:00
Peter Wemm	a422fed096	Hide whiteouts from NFS, since the protocol doesn't support them. Obtained from: NetBSD	1998-05-31 19:10:52 +00:00
Peter Wemm	c03d64df19	NetBSD has a comment that Solaris 2.5 doesn't do verifiers correctly, we have weakened this test already for Digital Unix, so it may be enough for Solaris. It needs to be checked again. Obtained from: NetBSD	1998-05-31 19:07:47 +00:00
Peter Wemm	13b9f88167	Don't pass a second copy of the uid/gid in with the v2/v3 sattr structures, it just makes more work. We pass a copy of the uid/gid with the credentials. (although, this may need to be revisited if a non AUTHUNIX authentication method (such as NFSKERB) ever gets implemented). Obtained from: NetBSD	1998-05-31 19:00:19 +00:00
Peter Wemm	d0e443aa3a	Use the new SB_UPCALL flag, Obtained from: NetBSD (but I changed the flag clear order in case).	1998-05-31 18:46:06 +00:00
Peter Wemm	e9156323b8	Don't try and free mrep twice on some error conditions. Obtained from: NetBSD	1998-05-31 18:19:43 +00:00
Peter Wemm	6301c8c330	#ifdef a diagnostic panic, plus another missed cosmetic change. Obtained from: NetBSD	1998-05-31 18:11:03 +00:00
Peter Wemm	1da42e389c	We have gained 2 more errno's, add them to the NFSv2 mapping table.	1998-05-31 18:09:18 +00:00
Peter Wemm	946010a5a4	Missed a cosmetic change that the other BSD's have.	1998-05-31 18:08:09 +00:00
Peter Wemm	535fa8520e	oops, nfs_msg() is called from client code too.	1998-05-31 18:06:07 +00:00
Peter Wemm	4a5f4c547e	When we can't reconnect a socket, don't forget to unlock before retrying or we can deadlock. Obtained from: NetBSD	1998-05-31 18:02:56 +00:00
Peter Wemm	6bea90a1ee	Don't log zero length reads, this can happen during normal operation. Obtained from: NetBSD	1998-05-31 18:00:46 +00:00
Peter Wemm	6c1a945540	Consider for readdir chunk sizes when tuning socket buffer reservations. Obtained from: NetBSD	1998-05-31 17:57:43 +00:00
Peter Wemm	dde4499fec	Refuse READDIR / READDIRPLUS rpc's for non-directories Obtained from: NetBSD	1998-05-31 17:54:18 +00:00
Peter Wemm	c489c83e4c	Some const's Obtained from: NetBSD	1998-05-31 17:48:07 +00:00
Peter Wemm	e8cf20c8db	NFS Jumbo commit part 1. Cosmetic and structural changes only. The aim of this part of commits is to minimize unnecessary differences between the other NFS's of similar origin. Yes, there are gratuitous changes here that the style folks won't like, but it makes the catch-up less difficult.	1998-05-31 17:27:58 +00:00
Peter Wemm	7c1c33a7dd	When using NFSv3, use the remote server's idea of the maximum file size rather than assuming 2^64. It may not like files that big. :-) On the nfs server, calculate and report the max file size as the point that the block numbers in the cache would turn negative. (ie: 1099511627775 bytes (1TB)). One of the things I'm worried about however, is that directory offsets are really cookies on a NFSv3 server and can be rather large, especially when/if the server generates the opaque directory cookies by using a local filesystem offset in what comes out as the upper 32 bits of the 64 bit cookie. (a server is free to do this, it could save byte swapping depending on the native 64 bit byte order) Obtained from: NetBSD	1998-05-30 16:33:58 +00:00
Peter Wemm	0d7d0fcf29	Convert a couple of large allocations to use zones rather than malloc for better packing. This means that we can choose better values for the various hash entries without having to try and get it all to fit within an artificial power of two limit for malloc's sake.	1998-05-24 14:41:56 +00:00
Peter Wemm	4204769d9e	Only ignore "owner" permissions selectively rather than always. In some cases we ignore it (eg: read/write) to maintain chmod-after-open semantics but in other cases we do care, eg: creating files, access() etc. Never ignore errors from VOP_ACCESS() on immutable files. This apparently comes from BSDI (from Keith Bostic) via NetBSD. PR: 5148 Submitted by: Yoshiro MIHIRA <sanpei@yy.cs.keio.ac.jp>	1998-05-20 09:05:48 +00:00
Peter Wemm	fe6c0d4599	Allow control of the attribute cache timeouts at mount time. We had run out of bits in the nfs mount flags, so I have moved the internal state flags into a separate variable. These are no longer visible via statfs(), but I don't know of anything that looks at them.	1998-05-19 07:11:27 +00:00
Bruce Evans	bf57f6f9b3	Get timespecs directly instead of via timevals.	1998-05-16 15:11:24 +00:00
Mike Smith	7be2d30077	In the words of the submitter: --------- Make callers of namei() responsible for releasing references or locks instead of having the underlying filesystems do it. This eliminates redundancy in all terminal filesystems and makes it possible for stacked transport layers such as umapfs or nullfs to operate correctly. Quality testing was done with testvn, and lat_fs from the lmbench suite. Some NFS client testing courtesy of Patrik Kudo. vop_mknod and vop_symlink still release the returned vpp. vop_rename still releases 4 vnode arguments before it returns. These remaining cases will be corrected in the next set of patches. --------- Submitted by: Michael Hancock <michaelh@cet.co.jp>	1998-05-07 04:58:58 +00:00
Poul-Henning Kamp	2f5f6b74ca	Use random() to find our initial xid.	1998-04-06 11:41:07 +00:00
Poul-Henning Kamp	227ee8a188	Eradicate the variable "time" from the kernel, using various measures. "time" wasn't an atomic variable, so splfoo() protection was needed around any access to it, unless you just wanted the seconds part. Most uses of time.tv_sec now use the new variable time_second instead. gettime() changed to getmicrotime(). Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it). A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random. Add a new nfs_curusec() function. Mark a couple of bogosities involving the now disappeared time variable. Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passed &time as args. Change profiling in ncr.c to use ticks instead of time. Resolution is the same. Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences. Reviewed by: bde	1998-03-30 09:56:58 +00:00
Eivind Eklund	303b270b0a	Staticize.	1998-02-09 06:11:36 +00:00
Eivind Eklund	0b08f5f737	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
Eivind Eklund	47cfdb166d	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
Bruce Evans	e7a5897899	Added #include of <sys/queue.h> so that this file is more "self"-sufficient.	1998-02-03 22:19:35 +00:00
Bruce Evans	9cf2c3e77a	Forward declare some structs so that this file is more self-sufficient.	1998-02-03 21:52:02 +00:00
Bruce Evans	bc3de552ad	Moved declaration of `union nethostadr' outside of the KERNEL section, to give pollution compatible with <nfs/nqfs.h>. At least mount_nfs.c previously had to #define KERNEL before including <nfs/nfs.h> to get this pollution, but this gave other pollution. Moved comment about NFSINT_SIGMASK to immediately before the code that it applies to.	1998-02-01 21:23:29 +00:00
John Dyson	eaf13dd73a	Change the busy page mgmt, so that when pages are freed, they MUST be PG_BUSY. It is bogus to free a page that isn't busy, because it is in a state of being "unavailable" when being freed. The additional advantage is that the page_remove code has a better cross-check that the page should be busy and unavailable for other use. There were some minor problems with the collapse code, and this plugs those subtle "holes." Also, the vfs_bio code wasn't checking correctly for PG_BUSY pages. I am going to develop a more consistent scheme for grabbing pages, busy or otherwise. For now, we are stuck with the current morass.	1998-01-31 11:56:53 +00:00
John Dyson	2be70f79f6	Lots of improvements, including restructuring the caching and management of vnodes and objects. There are some metadata performance improvements that come along with this. There are also a few prototypes added when the need is noticed. Changes include: 1) Cleaning up vref, vget. 2) Removal of the object cache. 3) Nuke vnode_pager_uncache and friends, because they aren't needed anymore. 4) Correct some missing LK_RETRY's in vn_lock. 5) Correct the page range in the code for msync. Be gentle, and please give me feedback asap.	1997-12-29 00:25:11 +00:00
Bruce Evans	675ea6f083	Unspammed nested include of <vm/vm_zone.h>.	1997-12-27 02:56:39 +00:00
Bruce Evans	3b1e500f27	Added a used include. Fixed a gratuitous ANSIism and nearby KNF violations.	1997-12-20 00:25:01 +00:00
Bruce Evans	638493a3c4	Don't call malloc(..., M_WAITOK) at splnet(). Doing so is often a mistake (since softnet interrupts may occur if malloc() waits), and doing it harmlessly but unnecessarily here interfered with detection of the mistaken cases.	1997-11-24 14:18:00 +00:00
Poul-Henning Kamp	4a11ca4e29	Remove a bunch of variables which were unused both in GENERIC and LINT. Found by: -Wunused	1997-11-07 08:53:44 +00:00
Poul-Henning Kamp	cb226aaa62	Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warnings, and removes a lot of cruft from the sources. I have not removed the /ARGSUSED/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need to be recompiled.	1997-11-06 19:29:57 +00:00
Bruce Evans	55b211e3af	Removed unused #includes.	1997-10-28 15:59:26 +00:00
Bruce Evans	3b67b033e1	Don't #include <nfs/nfs.h> in <nfs/nfs_node.h> if KERNEL is defined. Fixed everything that depended on the nested include.	1997-10-28 14:06:25 +00:00
Poul-Henning Kamp	5ebdb94a1b	Always initialize the syscall vectors for our "private" syscalls (not just in the LKM case). Plug nqnfs_vop_lease_check directly into the default_vnodeop_p table.	1997-10-26 20:13:52 +00:00
Poul-Henning Kamp	a1c995b626	Last major round (Unless Bruce thinks of something :-) of malloc changes. Distribute all but the most fundamental malloc types. This time I also remembered the trick to making things static: Put "static" in front of them. A couple of finer points by: bde	1997-10-12 20:26:33 +00:00
Poul-Henning Kamp	55166637cd	Distribute and staticize a lot of the malloc M_* types. Substantial input from: bde	1997-10-11 18:31:40 +00:00
John Dyson	99448ed11d	Change the M_NAMEI allocations to use the zone allocator. This change plus the previous changes to use the zone allocator decrease the usage of malloc by half. The Zone allocator will be upgradeable to be able to use per CPU-pools, and has more intelligent usage of SPLs. Additionally, it has reasonable stats gathering capabilities, while making most calls inline.	1997-09-21 04:24:27 +00:00
Poul-Henning Kamp	ec1b5c319d	Remove a couple of stubborn NetBSD #if's.	1997-09-10 20:22:32 +00:00
Poul-Henning Kamp	07b2d0aaa3	unifdef -U__NetBSD__ -D__FreeBSD__	1997-09-10 19:52:27 +00:00
Bruce Evans	4d1d4912ae	Added used #include - don't depend on <sys/mbuf.h> including <sys/malloc.h> (unless we only use the bogusly shared M*WAIT flags).	1997-09-02 01:19:47 +00:00
Garrett Wollman	57bf258e3d	Fix all areas of the system (or at least all those in LINT) to avoid storing socket addresses in mbufs. (Socket buffers are the one exception.) A number of kernel APIs needed to get fixed in order to make this happen. Also, fix three protocol families which kept PCBs in mbufs to now malloc them instead. Delete some old compatibility cruft while we're at it, and add some new routines in the in_cksum family.	1997-08-16 19:16:27 +00:00
Bruce Evans	1fd0b0588f	Removed unused #includes.	1997-08-02 14:33:27 +00:00
Doug Rabson	abfbc4005f	Correct some dumb mistakes in the WebNFS stuff. Submitted by: bde	1997-07-22 15:35:57 +00:00
Doug Rabson	c4b3a97040	Allow NULL cookie verifiers for non-NULL offsets. This is needed for Digital Unix boxes since they appear to always send null verifiers.	1997-07-22 15:35:15 +00:00
Doug Rabson	e775608178	Merge WebNFS changes from NetBSD. Obtained from: NetBSD	1997-07-16 09:06:30 +00:00
Tor Egge	932c8934e8	Clear nfs_iodwant[myiod] when the nfsiod process exits due to a signal.	1997-06-25 21:07:26 +00:00
Bruce Evans	f361d28a2c	Don't require superuser privileges for creating fifos. The v2 case was broken when support for v3 was introduced in rev.1.16. The v3 case has always been broken in FreeBSD. Should be in 2.2. PR: 3838	1997-06-14 11:19:35 +00:00
Doug Rabson	7d6b68c4de	Various fixes from NetBSD: Use u_int for rpc procedure numbers. Some fixes to NQNFS. A rare NULL pointer dereference. Ignore NFSMNT_NOCONN for TCP mounts. Obtained from: NetBSD	1997-06-03 17:22:47 +00:00
Doug Rabson	d1e963a50e	Implement the async mount option for NFSv3. This makes NFS pretend that all writes sent to the server were synchronous and therefore no commits are needed. This is the same as the vfs.nfs.async variable on the server but allows each client to choose whether to work this way. Also make the vfs.nfs.async variable do the 'right' thing for NFSv3, i.e. pretend that the write was synchronous.	1997-06-03 13:56:55 +00:00
Doug Rabson	32ad9cb531	Fix a few bugs with NFS and mmap caused by NFS' use of b_validoff and b_validend. The changes to vfs_bio.c are a bit ugly but hopefully can be tidied up later by a slight redesign. PR: kern/2573, kern/2754, kern/3046 (possibly) Reviewed by: dyson	1997-05-19 14:36:56 +00:00
Doug Rabson	cb934d56d1	Don't keep addresses in mbuf chains. This should simplify the next round of network changes from Garrett. Reviewed by: Garrett Wollman <wollman@khavrinen.lcs.mit.edu>	1997-05-13 17:25:44 +00:00
Doug Rabson	0160dedc65	Implement a separate control for write gathering on NFSv3. This is turned off for NFSv3 by default since write gathering seems to reduce performance for NFSv3 by up to 60%. Add sysctl knobs to control both variables.	1997-05-10 16:59:36 +00:00
Doug Rabson	5ae0f71815	Fix a nasty hang connected with write gathering. Also add debug print statements to bits of the server which helped me find the hang.	1997-05-10 16:12:03 +00:00
Doug Rabson	6382d3ad84	Allow NULL rpcs on non-privileged ports at all times to work around broken clients. PR: kern/3298 Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no>	1997-04-30 09:51:37 +00:00
Garrett Wollman	a29f300e80	The long-awaited mega-massive-network-code- cleanup. Part I. This commit includes the following changes: 1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility glue for them is deleted, and the kernel will panic on boot if any are compiled in. 2) Certain protocol entry points are modified to take a process structure, so that they can easily tell whether or not it is possible to sleep, and also to access credentials. 3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt() call. Protocols should use the process pointer they are now passed. 4) The PF_LOCAL and PF_ROUTE families have been updated to use the new style, as has the `raw' skeleton family. 5) PF_LOCAL sockets now obey the process's umask when creating a socket in the filesystem. As a result, LINT is now broken. I'm hoping that some enterprising hacker with a bit more time will either make the broken bits work (should be easy for netipx) or dike them out.	1997-04-27 20:01:29 +00:00
Doug Rabson	9aa2858d44	Fix broken usage of nm_readdirsize and increase the socket buffers for UDP to prevent possible socket overflows. 2.2 candidate. PR: kern/3304 Reviewed by: Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>	1997-04-22 17:38:01 +00:00
Doug Rabson	4ba14e3a10	Fix various bugs in the locking protocol, allowing proper shared locks to be used. This should fix the lock panics that people are seeing.	1997-04-04 17:49:35 +00:00
Bruce Evans	b445591810	Removed #include of <ufs/ufs/dir.h>. Nfs no longer depends on any ufs features, and the one thing that it depended on (DIRBLKSIZ) now has conflicting spelling.	1997-03-29 12:40:20 +00:00
Bruce Evans	00780cef44	Define our own version of DIRBLKSIZ instead of (ab)using ufs's value. Use the same value of 512 (ufs actually uses DEV_BSIZE). There are too many versions of DIRBLKSIZ, one for ufs, one for ext2fs, one for nfs, one for ibcs2, one for linux, one for applications, ... I think nfs's DIRBLKSIZ needs to be a divisor of the directory blocks sizes of all supported file systems. There is also NFS_DIRBLKSIZ, which is different from nfs's DIRBLKSIZ but is sometimes confused with it in comments. Removed a bogus #ifdef KERNEL that hid the tunable constants for nfs. This came in undocumented with the Lite2 merge although it isn't in Lite2. It required more-bogus #define KERNEL's in fstat and pstat to make the constants visible. Restored a spelling fix from rev.1.17. Removed duplicate #defines of all the NFS mount option flags.	1997-03-29 12:34:33 +00:00
Guido van Rooij	394da4c167	Add code that will reject nfs requests in the kernel from nonprivileged ports. This option will be automatically set/cleared when mount is run without/with the -n option. Reviewed by: Doug Rabson	1997-03-27 20:01:07 +00:00
Peter Wemm	476b25e22e	Use the correct (relative to the implementation) ordering of args in the VOP_LINK() calls, Closes PR#3064 Submitted by: bde	1997-03-25 05:13:40 +00:00
Peter Wemm	289c56e81e	The local fs interface does not allow link()/unlink() of directories, do not allow a remote nfs client to cause local fs corruption either.	1997-03-25 05:08:28 +00:00
Bruce Evans	3c81694426	Fixed some invalid (non-atomic) accesses to `time', mostly ones of the form `tv = time'. Use a new function gettime(). The current version just forces atomicity without fixing precision or efficiency bugs. Simplified some related valid accesses by using the central function.	1997-03-22 06:53:45 +00:00
Peter Wemm	6875d25465	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
John Dyson	996c772f58	This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>	1997-02-10 02:22:35 +00:00
Jordan K. Hubbard	1130b656e5	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
Doug Rabson	f438ae02f5	Improve the queuing algorithms used by NFS' asynchronous i/o. The existing mechanism uses a global queue for some buffers and the vp->b_dirtyblkhd queue for others. This turns sequential writes into randomly ordered writes to the server, affecting both read and write performance. The existing mechanism also copes badly with hung servers, tending to block accesses to other servers when all the iods are waiting for a hung server. The new mechanism uses a queue for each mount point. All asynchronous i/o goes through this queue which preserves the ordering of requests. A simple mechanism ensures that the iods are shared out fairly between active mount points. This removes the sysctl variable vfs.nfs.dwrite since the new queueing mechanism removes the old delayed write code completely. This should go into the 2.2 branch.	1996-11-06 10:53:16 +00:00
Doug Rabson	f31dba4c5d	This fixes a problem with the nfs socket handling code which happens if a single process is performing a large number of requests (in this case writing a large file). The writing process could monopolise the receive lock and prevent any other processes from receiving their replies. It also adds a new sysctl variable 'vfs.nfs.dwrite' which controls the behaviour which originally pointed out the problem. When a process writes to a file over NFS, it usually arranges for another process (the 'iod') to perform the request. If no iods are available, then it turns the write into a 'delayed write' which is later picked up by the next iod to do a write request for that file. This can cause that particular iod to do a disproportionate number of requests from a single process which can harm performance on some NFS servers. The alternative is to perform the write synchronously in the context of the original writing process if no iod is available for asynchronous writing. The 'delayed write' behaviour is selected when vfs.nfs.dwrite=1 and the non-delayed behaviour is selected when vfs.nfs.dwrite=0. The default is vfs.nfs.dwrite=1; if many people tell me that performance is better if vfs.nfs.dwrite=0 then I will change the default. Submitted by: Hidetoshi Shimokawa <simokawa@sat.t.u-tokyo.ac.jp>	1996-10-11 10:15:33 +00:00
Nate Williams	030e2e9ebb	In sys/time.h, struct timespec is defined as: /* * Structure defined by POSIX.4 to be like a timeval. */ struct timespec { time_t ts_sec; /* seconds */ long ts_nsec; /* and nanoseconds */ }; The correct names of the fields are tv_sec and tv_nsec. Reminded by: James Drobina <jdrobina@infinet.com>	1996-09-19 18:21:32 +00:00
David Greenman	0247363f69	Release an unneeded reference to a vnode that was gained in a VFS_VGET(). Fixes a readdirplus panic. Submitted by: Doug Rabson <dfr@render.com>	1996-09-05 07:58:04 +00:00
Bruce Evans	b71fec07db	Eliminated nested include of <sys/unistd.h> in <sys/file.h> in the kernel. Include it directly in the few places where it is used. Reduced some #includes of <sys/file.h> to #includes of <sys/fcntl.h> or nothing.	1996-09-03 14:25:27 +00:00
John Dyson	6476c0d204	Even though this looks like it, this is not a complex code change. The interface into the "VMIO" system has changed to be more consistent and robust. Essentially, it is now no longer necessary to call vn_open to get merged VM/Buffer cache operation, and exceptional conditions such as merged operation of VBLK devices is simpler and more correct. This code corrects a potentially large set of problems including the problems with ktrace output and loaded systems, file create/deletes, etc. Most of the changes to NFS are cosmetic and name changes, eliminating a layer of subroutine calls. The direct calls to vput/vrele have been re-instituted for better cross platform compatibility. Reviewed by: davidg	1996-08-21 21:56:23 +00:00
Doug Rabson	09c6884729	Various fixes from frank@fwi.uva.nl (Frank van der Linden) via rick@snowhite.cis.uoguelph.ca: 1. Clear B_NEEDCOMMIT in nfs_write to make sure that dirty data is correctly sent to the server. If a buffer was dirtied when it was in the B_DELWRI+B_NEEDCOMMIT state, the state of the buffer was left unchanged and when the buffer was later cleaned, just a commit rpc was made to the server to complete the previous write. Clearing B_NEEDCOMMIT ensures that another write is made to the server. 2. If a server (for whatever reason) returned an answer to a write RPC that implied that fewer bytes than requested were written, bad things would happen. 3. The setattr operation passed on the atime instead of the mtime to the server. The fix is trivial. 4. XIDs always started at 0, but this caused some servers (older DEC OSF/1 3.0 so I've been told) who had very long-lasting XID caches to get confused if, after a reboot of a BSD client, RPCs came in with a XID that had in the past been used before from that client. Patch is to use the current time in seconds as a starting point for XIDs. The patch below is not perfect, because it requires the root fs to be mounted first. This is because of the check BSD systems do, comparing FS time to system time. Reviewed by: Bruce Evans, Terry Lambert. Obtained from: frank@fwi.uva.nl (Frank van der Linden) via rick@snowhite.cis.uoguelph.ca	1996-07-16 10:19:45 +00:00
Garrett Wollman	2c37256e5a	Modify the kernel to use the new pr_usrreqs interface rather than the old pr_usrreq mechanism which was poorly designed and error-prone. This commit renames pr_usrreq to pr_ousrreq so that old code which depended on it would break in an obvious manner. This commit also implements the new interface for TCP, although the old function is left as an example (#ifdef'ed out). This commit ALSO fixes a longstanding bug in the TCP timer processing (introduced by davidg on 1995/04/12) which caused timer processing on a TCB to always stop after a single timer had expired (because it misinterpreted the return value from tcp_usrreq() to indicate that the TCB had been deleted). Finally, some code related to polling has been deleted from if.c because it is not relevant to -current and doesn't look at all like my current code.	1996-07-11 16:32:50 +00:00
Bruce Evans	8cd5acbce0	Don't truncate minor or major numbers in the nfsv3 client.	1996-06-23 17:19:25 +00:00
Poul-Henning Kamp	5b28a6011f	Fix for NFS_NOSERVER Poul mentioned that he thought this was some kind of timing problem, and that started me thinking. After a little poking around, I found that nfs_timer() was completely disabled when NFS_NOSERVER was #defined. But after looking at nfs_timer(), it seemed like it was something required by both the client and server code, and disabling it outright just didn't seem to make any sense. Parts of it relate only to the NFS server side code, so I disabled those, but I re-enabled the rest of the function and made sure that it would be called from nfs_init() (in nfs_subs.c). With nfs_timer() re-enabled, everything seems to work again. The only other changes I made were to #ifdef away some variable declarations in the NFS_NOSERVER case so that gcc would stop complaining about unused variables. Reviewed by: phk Submitted by: Bill Paul <wpaul@skynet.ctr.columbia.edu>	1996-06-14 11:13:21 +00:00
Bruce Evans	39a3579443	Fixed a vnode reference leak in nfsrv_rename(). The target inode wasn't released until the file system was unmounted. This bug also affected kern/vfs_syscalls.c but was fixed in rev.1.18 and rev.1.20 there. Reviewed by: davidg	1996-06-08 12:16:26 +00:00
Bruce Evans	71d96b71c2	#include <sys/filedesc.h> explicitly instead of depending on it being bogusly included by <sys/socketvar.h>.	1996-04-30 23:26:52 +00:00
Bruce Evans	21e0797227	Fixed nfs sysctls. They missed out on the fs -> vfs name changes from Lite2. This broke nfsstat.	1996-04-30 23:23:09 +00:00
Garrett Wollman	dc915e7cfc	Kill XNS. While we're at it, fix socreate() to take a process argument. (This was supposed to get committed days ago...)	1996-02-13 18:16:31 +00:00
Mike Pritchard	6c5e9bbdf5	Fix a bunch of spelling errors in the comment fields of a bunch of system include files.	1996-01-30 23:02:38 +00:00
John Dyson	bd7e5f992e	Eliminated many redundant vm_map_lookup operations for vm_mmap. Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish overhead for merged cache. Efficiency improvement for vfs_cluster. It used to do a lot of redundant calls to cluster_rbuild. Correct the ordering for vrele of .text and release of credentials. Use the selective tlb update for 486/586/P6. Numerous fixes to the size of objects allocated for files. Additionally, fixes in the various pagers. Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs. Fixes in the swap pager for exhausted resources. The pageout code will not as readily thrash. Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE), thereby improving efficiency of several routines. Eliminate even more unnecessary vm_page_protect operations. Significantly speed up process forks. Make vm_object_page_clean more efficient, thereby eliminating the pause that happens every 30 seconds. Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the case of filesystems mounted async. Fix a panic with busy pages when write clustering is done for non-VMIO buffers.	1996-01-19 04:00:31 +00:00
Poul-Henning Kamp	99cb299316	Add an option NFS_NOSERVER which saves 100K in the install kernel (or any other kernel that uses it). Use with option NFS.	1996-01-13 23:27:58 +00:00
Poul-Henning Kamp	b8dce649f1	Staticize.	1995-12-17 21:14:36 +00:00
David Greenman	efeaf95a41	Untangled the vm.h include file spaghetti.	1995-12-07 12:48:31 +00:00
Bruce Evans	dee6b0ab68	Completed function declarations and/or added prototypes and/or moved prototypes to the right place.	1995-12-03 10:03:12 +00:00
Bruce Evans	55054f3540	Completed function declarations, added prototypes and removed redundant declarations.	1995-11-21 15:51:39 +00:00
Bruce Evans	512fef80a9	Completed function declarations and/or added prototypes.	1995-11-21 12:55:26 +00:00
Bruce Evans	e4f937b07b	Included <sys/sysproto.h> to get central declarations for syscall args structs and prototypes for syscalls. Ifdefed duplicated decentralized declarations of args structs. It's convenient to have this visible but they are hard to maintain. Some are already different from the central declarations. 4.4lite2 puts them in comments in the function headers but I wanted to avoid the large changes for that.	1995-11-14 05:16:37 +00:00
Joerg Wunsch	e046098fa9	Include a prerequisite header (so this is consistent again with the NFSv2 state).	1995-10-31 21:17:59 +00:00
Poul-Henning Kamp	a98ca4699e	Second batch of cleanup changes. This time mostly making a lot of things static and some unused variables here and there.	1995-10-29 15:33:36 +00:00
David Greenman	c0c06a67d2	Added NFS_ASYNC kernel option. It only has an effect for NFSv2.	1995-08-24 11:39:31 +00:00
David Greenman	dcc84850a7	Killed redundant declarations of nfsm_rpchead().	1995-08-24 11:04:04 +00:00
Doug Rabson	c3b2cc769c	Some fixes found using gcc -Wall: nfsm_rpchead() has been called with the wrong number of args and misplaced args since someone added new args in the middle for nfsv3. Here's another one that would be important on 64-bit systems. VOP_READDIR takes a `u_int **cookies' arg. Submitted by: Bruce Evans <bde@zeta.org.au>	1995-08-24 10:45:16 +00:00
Doug Rabson	27df97742b	Add support for amd direct maps. Reviewed by: Thomas Graichen <graichen@sirius.physik.fu-berlin.de>	1995-08-24 10:17:39 +00:00
David Greenman	75d8591e04	Fixed bug where vnode_pager_uncache() wasn't always called when it should be. The result was that the file's space wouldn't be properly freed when it was deleted. Submitted by: John Dyson	1995-08-06 11:55:25 +00:00
Doug Rabson	7faccad982	Slight changes to locking around VOP_READDIR. Detect in nfsrv_readdirplus when a filesystem does not support VFS_VGET and return NFSERR_NOTSUPP so that the client will use ordinary readdir instead.	1995-08-03 12:14:16 +00:00
Doug Rabson	a2c06d4685	Lock the directory vnode before VOP_READDIR in nfsrv_readdirplus	1995-08-02 10:12:47 +00:00
David Greenman	4777741358	Removed my special-case hack for VOP_LINK and fixed the problem with the wrong vp's ops vector being used by changing the VOP_LINK's argument order. The special-case hack doesn't go far enough and breaks the generic bypass routine used in some non-leaf filesystems. Pointed out by Kirk McKusick.	1995-08-01 18:51:02 +00:00
Bruce Evans	28f8db1403	Eliminate sloppy common-style declarations. There should be none left for the LINT configuration.	1995-07-29 11:44:31 +00:00
David Greenman	24aa09cd4f	vnode_pager_alloc() never returns NULL, so don't check for it.	1995-07-20 09:43:12 +00:00
David Greenman	24a1cce34f	NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct proc or any VM system structure will have to be rebuilt!!! Much needed overhaul of the VM system. Included in this first round of changes: 1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages, haspage, and sync operations are supported. The haspage interface now provides information about clusterability. All pager routines now take struct vm_object's instead of "pagers". 2) Improved data structures. In the previous paradigm, there is constant confusion caused by pagers being both a data structure ("allocate a pager") and a collection of routines. The idea of a pager structure has essentially been eliminated. Objects now have types, and this type is used to index the appropriate pager. In most cases, items in the pager structure were duplicated in the object data structure and thus were unnecessary. In the few cases that remained, a un_pager structure union was created in the object to contain these items. 3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now be removed. For instance, vm_object_enter(), vm_object_lookup(), vm_object_remove(), and the associated object hash list were some of the things that were removed. 4) simple_lock's removed. Discussion with several people reveals that the SMP locking primitives used in the VM system aren't likely the mechanism that we'll be adopting. Even if it were, the locking that was in the code was very inadequate and would have to be mostly re-done anyway. The locking in a uni-processor kernel was a no-op but went a long way toward making the code difficult to read and debug. 5) Places that attempted to kludge-up the fact that we don't have kernel thread support have been fixed to reflect the reality that we are really dealing with processes, not threads. The VM system didn't have complete thread support, so the comments and mis-named routines were just wrong. 
We now use tsleep and wakeup directly in the lock routines, for instance. 6) Where appropriate, the pagers have been improved, especially in the pager_alloc routines. Most of the pager_allocs have been rewritten and are now faster and easier to maintain. 7) The pagedaemon pageout clustering algorithm has been rewritten and now tries harder to output an even number of pages before and after the requested page. This is sort of the reverse of the ideal pagein algorithm and should provide better overall performance. 8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup have been removed. Some other unnecessary casts have also been removed. 9) Some almost useless debugging code removed. 10) Terminology of shadow objects vs. backing objects straightened out. The fact that the vm_object data structure essentially had this backwards really confused things. The use of "shadow" and "backing object" throughout the code is now internally consistent and correct in the Mach terminology. 11) Several minor bug fixes, including one in the vm daemon that caused 0 RSS objects to not get purged as intended. 12) A "default pager" has now been created which cleans up the transition of objects to the "swap" type. The previous checks throughout the code for swp->pg_data != NULL were really ugly. This change also provides the rudiments for future backing of "anonymous" memory by something other than the swap pager (via the vnode pager, for example), and it allows the decision about which of these pagers to use to be made dynamically (although will need some additional decision code to do this, of course). 13) (dyson) MAP_COPY has been deprecated and the corresponding "copy object" code has been removed. MAP_COPY was undocumented and non-standard. It was furthermore broken in several ways which caused its behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will continue to work correctly, but via the slightly different semantics of MAP_PRIVATE. 
14) (dyson) Sharing maps have been removed. Its marginal usefulness in a threads design can be worked around in other ways. Both #12 and #13 were done to simplify the code and improve readability and maintainability. (As were most all of these changes) TODO: 1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing this will reduce the vnode pager to a mere fraction of its current size. 2) Rewrite vm_fault and the swap/vnode pagers to use the clustering information provided by the new haspage pager interface. This will substantially reduce the overhead by eliminating a large number of VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be improved to provide both a "behind" and "ahead" indication of contiguousness. 3) Implement the extended features of pager_haspage in swap_pager_haspage(). It currently just says 0 pages ahead/behind. 4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps via a much more general mechanism that could also be used for disk striping of regular filesystems. 5) Do something to improve the architecture of vm_object_collapse(). The fact that it makes calls into the swap pager and knows too much about how the swap pager operates really bothers me. It also doesn't allow for collapsing of non-swap pager objects ("unnamed" objects backed by other pagers).	1995-07-13 08:48:48 +00:00
David Greenman	06cb725951	Moved call to VOP_GETATTR() out of vnode_pager_alloc() and into the places that call vnode_pager_alloc() so that a failure return can be dealt with. This fixes a panic seen on NFS clients when a file being opened is deleted on the server before the open completes.	1995-07-09 06:58:03 +00:00
David Greenman	aa2cabb958	1) Converted v_vmdata to v_object. 2) Removed unnecessary vm_object_lookup()/pager_cache(object, TRUE) pairs after vnode_pager_alloc() calls - the object is already guaranteed to be persistent. 3) Removed some gratuitous casts.	1995-06-28 12:01:13 +00:00
David Greenman	9879652657	Fixed VOP_LINK argument order botch.	1995-06-28 07:06:55 +00:00
Doug Rabson	a62dc40654	Changes to support version 3 of the NFS protocol. The version 2 support has been tested (client+server) against FreeBSD-2.0, IRIX 5.3 and FreeBSD-current (using a loopback mount). The version 2 support is stable AFAIK. The version 3 support has been tested with a loopback mount and minimally against an IRIX 5.3 server. It needs more testing and may have problems. I have patched amd to support the new variable length filehandles although it will still only use version 2 of the protocol. Before booting a kernel with these changes, nfs clients will need to at least build and install /usr/sbin/mount_nfs. Servers will need to build and install /usr/sbin/mountd. NFS diskless support is untested. Obtained from: Rick Macklem <rick@snowhite.cis.uoguelph.ca>	1995-06-27 11:07:30 +00:00
Joerg Wunsch	ec05e1f5d7	The duplicate information returned in fa_type and fa_mode is an ambiguity in the NFS version 2 protocol. VREG should be taken literally as a regular file. If a server intends to return some type information differently in the upper bits of the mode field (e.g. for sockets, or FIFOs), NFSv2 mandates fa_type to be VNON. Anyway, we leave the examination of the mode bits even in the VREG case to avoid breakage for bogus servers, but we make sure that there are actually type bits set in the upper part of fa_mode (and failing that, trust the va_type field). NFSv3 cleared the issue, and requires fa_mode to not contain any type information (while also introducing sockets and FIFOs for fa_type). The fix has been tested against a variety of NFS servers. It fixes problems with the ``Tropic'' NFS server for Windows, while apparently not breaking anything. Pointed-out by: scott@zorch.sf-bay.org (Scott Hazen Mueller)	1995-06-14 06:23:38 +00:00
Rodney W. Grimes	d3628763db	Merge RELENG_2_0_5 into HEAD	1995-06-11 19:33:05 +00:00
Rodney W. Grimes	9b2e535452	Remove trailing whitespace.	1995-05-30 08:16:23 +00:00
David Greenman	77f53bcf27	Fixed some serious bugs that resulted in object reference counts not being handled correctly. This would manifest itself as "object deallocated too many times" panics and perhaps other strange inconsistencies on NFS servers. Reviewed by: me, of course Submitted by: John Dyson	1995-05-29 04:01:09 +00:00
John Dyson	e0a4d029a5	Slight re-ordering of the creation of a vmio object to fix a condition that can cause NFS I/O failures.	1995-04-21 02:58:49 +00:00
David Greenman	50475e8bd3	Removed unnecessary call to vnode_pager_uncache(). We automatically clear the VTEXT flag after all mappers have finished with the object.	1995-03-19 12:08:03 +00:00
David Greenman	45e3e324cc	Changed some (incorrect) nfsrv_vput()'s back into regular vput()'s. This fixes the last of the known NQNFS problems (until I find more, that is :-)).	1995-03-17 07:45:19 +00:00
Bruce Evans	b5e8ce9f12	Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.	1995-03-16 18:17:34 +00:00
Poul-Henning Kamp	78ff637a2c	YF fix.	1995-02-15 04:21:32 +00:00
David Greenman	efefea024a	Fixed two more bugs related to the merged cache changes. Submitted by: John Dyson	1995-02-15 03:40:00 +00:00
David Greenman	ad21d87fd5	Woops, change a nfsrv_vput back into a nfsrv_vrele. Submitted by: John Dyson	1995-02-15 03:38:12 +00:00
David Greenman	6b03a7ffc5	Fixed three bugs related to the merged cache changes. The bugs likely would make NFS servers flakey - probably the cause of freefall's recent hangs. Submitted by: John Dyson	1995-02-15 03:03:03 +00:00
Poul-Henning Kamp	473e9734a8	YFfix +int nfsrv_vput __P(( struct vnode * )); +int nfsrv_vrele __P(( struct vnode * )); +int nfsrv_vmio __P(( struct vnode * ));	1995-02-14 06:22:18 +00:00
David Greenman	081129c5e3	Changed order of release of vnode/object to fix a problem where the vnode is freed with an old object still attached (subsequently causing a panic). Fixes NFS server panic "object/pager mismatch". Submitted by: John Dyson	1995-02-06 02:20:40 +00:00
David Greenman	0d94caffca	These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D. The majority of the merged VM/cache work is by John Dyson. The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme. vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering. vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff. vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption. vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up. vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme. pmap.c vm_map.c Dynamic kernel VM size, now we don't have to pre-allocate excessive numbers of kernel PTs. vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping. proc.h Fixed the problem that the p_lock flag was not being cleared on a fork. swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore. machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme. machdep.c, kern_exec.c, vm_glue.c Implemented a separate submap for temporary exec string space and another one to contain process upages.
This eliminates all map fragmentation problems that previously existed. ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers. Submitted by: John Dyson and David Greenman	1995-01-09 16:06:02 +00:00
Garrett Wollman	d9e91095ab	Forward-declare a few structures to avoid warning messages.	1994-11-02 00:11:00 +00:00
Garrett Wollman	b43e29afed	Implement fs.nfs MIB variables.	1994-10-23 23:26:18 +00:00
Poul-Henning Kamp	6ae324074a	This is a bunch of changes from NetBSD. There are a couple of bug-fixes. But mostly it is changes to use the list-maintenance macros instead of doing the pointer-gymnastics by hand. Obtained from: NetBSD	1994-10-17 17:47:45 +00:00
Poul-Henning Kamp	48fbb6cc7e	Prototyping and general gcc-shutting up. Gcc has one warning now which looks bad, I will get to it eventually, unless somebody beats me to it.	1994-10-02 17:27:07 +00:00
Doug Rabson	9abf4d6ee0	Make NFS ask the filesystems for directory cookies instead of making them itself.	1994-09-28 16:45:22 +00:00
Garrett Wollman	e21fa31a8e	Make NFS loadable.	1994-09-22 22:10:49 +00:00


357 Commits