Commit Graph

524 Commits

Author SHA1 Message Date
Doug Rabson
6382d3ad84 Allow NULL rpcs on non-privileged ports at all times to work around broken
clients.

PR:		kern/3298
Submitted by:	Tor Egge <Tor.Egge@idi.ntnu.no>
1997-04-30 09:51:37 +00:00
Garrett Wollman
a29f300e80 The long-awaited mega-massive-network-code- cleanup. Part I.
This commit includes the following changes:
1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility
glue for them is deleted, and the kernel will panic on boot if any are compiled
in.

2) Certain protocol entry points are modified to take a process structure,
so they they can easily tell whether or not it is possible to sleep, and
also to access credentials.

3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt()
call.  Protocols should use the process pointer they are now passed.

4) The PF_LOCAL and PF_ROUTE families have been updated to use the new
style, as has the `raw' skeleton family.

5) PF_LOCAL sockets now obey the process's umask when creating a socket
in the filesystem.

As a result, LINT is now broken.  I'm hoping that some enterprising hacker
with a bit more time will either make the broken bits work (should be
easy for netipx) or dike them out.
1997-04-27 20:01:29 +00:00
Doug Rabson
9aa2858d44 Fix broken usage of nm_readdirsize and increase the socket buffers for UDP
to prevent possible socket overflows.

2.2 candidate.

PR:		kern/3304
Reviewed by:	Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>
1997-04-22 17:38:01 +00:00
Doug Rabson
baaf1d96f0 Fix a bug where a program which appended many small records to a file could
wind up writing zeros instead of real data when the file is on an NFSv2
mounted directory.

While tracking this bug down, I noticed that nfs_asyncio was waking *all*
the iods when a block was written instead of just one per block.  Fixing this
gives a 25% performance improvment for writes on v2 (less for v3).

Both are 2.2 candidates.

PR:		kern/2774
1997-04-19 14:28:36 +00:00
Doug Rabson
18cab10cb3 Don't allow partial buffers to be cluster-comitted.
Zero the b_dirty{off,end} after cluster-comitting a group of buffers.

With these fixes, I was able to complete a 'make world' with remote src
and obj directories.
1997-04-18 14:12:17 +00:00
Doug Rabson
4ba14e3a10 Fix various bugs in the locking protocol, allowing proper shared locks
to be used.  This should fix the lock panics that people are seeing.
1997-04-04 17:49:35 +00:00
Doug Rabson
363880128c The code which recovered from a modified directory situation did not check
for eof when re-caching the directory.  This could cause it to loop forever
if a directory was truncated.
1997-04-03 07:52:00 +00:00
Bruce Evans
b445591810 Removed #include of <ufs/ufs/dir.h>. Nfs no longer depends on any ufs
features, and the one thing that it depended on (DIRBLKSIZ) now has
conflicting spelling.
1997-03-29 12:40:20 +00:00
Bruce Evans
00780cef44 Define our own version of DIRBLKSIZ instead of (ab)using ufs's value.
Use the same value of 512 (ufs actually uses DEV_BSIZE).  There are
too many versions of DIRBLKSIZ, one for ufs, one for ext2fs, one for
nfs, one for ibcs2, one for linux, one for applications, ... I think
nfs's DIRBLKSIZ needs to be a divisor of the directory blocks sizes
of all supported file systems.  There is also NFS_DIRBLKSIZ, which is
different from nfs's DIRBLKSIZ but is sometimes confused with it in
comments.

Removed a bogus #ifdef KERNEL that hid the tunable constants for nfs.
This came in undocumented with the Lite2 merge although it isn't in
Lite2.  It required more-bogus #define KERNEL's in fstat and pstat
to make the constants visible.

Restored a spelling fix from rev.1.17.

Removed duplicate #defines of all the the NFS mount option flags.
1997-03-29 12:34:33 +00:00
Guido van Rooij
394da4c167 Add code that will reject nfs requests in teh kernel from nonprivileged
ports. This option will be automatically set/cleraed when mount is run
without/with the -n option.
Reviewed by:	Doug Rabson
1997-03-27 20:01:07 +00:00
Bruce Evans
51a534883a Don't include <sys/ioctl.h> in the kernel. Stage 2: include
<sys/sockio.h> instead of <sys/ioctl.h> in network files.
1997-03-24 11:33:46 +00:00
Bruce Evans
3c81694426 Fixed some invalid (non-atomic) accesses to `time', mostly ones of the
form `tv = time'.  Use a new function gettime().  The current version
just forces atomicicity without fixing precision or efficiency bugs.
Simplified some related valid accesses by using the central function.
1997-03-22 06:53:45 +00:00
Bruce Evans
7eff94279a YAMInTheWrongDirectionF22 (part of rev.1.28.2.3: set B_CLUSTEROK for
commits).
1997-03-09 10:21:26 +00:00
Bruce Evans
2ca8d13195 Fixed a panic in nfs_writevp(). Lite2 provided a fix for a silly
missing-parentheses bug, but this exposed a misplaced vfs_busy_pages().
This bug cost a factor of 2.5-3 in nfsv3 write performance!  It should
be fixed in 2.2.

Removed some debugging code that gets triggered often in normal
operation.  There are still many backwards diagnostics (#define
DIAGNOSTIC gives no diagnostics).

Submitted by:	vfs_busy_pages() fix by dfr
1997-02-28 17:56:27 +00:00
Peter Wemm
6875d25465 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
Bruce Evans
bd6d39b941 Changed #ifdef COMPAT_PRELITE2' to #ifndef NO_COMPAT_PRELITE2' so that
old nfs mount calls are supported by default.
1997-02-18 04:40:38 +00:00
John Dyson
996c772f58 This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
		Mount_std mounts will not work until the getfsent
		library routine is changed.

Reviewed by:	various people
Submitted by:	Jeffery Hsu <hsu@freebsd.org>
1997-02-10 02:22:35 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Bill Paul
1b4a7d506f Fix (properly, I hope) 'panic: sillyrename dir' crash that can happen
if you do:

% cd /nfsdir
% mkdir -p foo/foo
% mv foo/foo .

nfs_sillyrename() self-destructs if you try to sillyrename a directory,
however nfs_rename() can be coerced into doing just that by the above
sequence of commands. To avoid this, nfs_rename() now checks that
v_type of the 'destination' vnode != VDIR before attempting the
sillyrename. The server correctly handles this particular situation
by returning ENOTEMPTY on the rename() attempt.

I asked if this was the correct fix for this on -hackers but nobody
ever answered.

This is a 2.2 candidate.
1996-12-31 07:10:19 +00:00
Garrett Wollman
59562606b9 Convert the interface address and IP interface address structures
to TAILQs.  Fix places which referenced these for no good reason
that I can see (the references remain, but were fixed to compile
again; they are still questionable).
1996-12-13 21:29:07 +00:00
Doug Rabson
f438ae02f5 Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others.  This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance.  The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point.  All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.  This removes the sysctl variable vfs.nfs.dwrite
since the new queueing mechanism removes the old delayed write code
completely.

This should go into the 2.2 branch.
1996-11-06 10:53:16 +00:00
Doug Rabson
425b5191a4 If a large (>4096 bytes) directory was modified, the old directory
contents are discarded, including the cached seek cookies.
Unfortunately, if the directory was larger than NFS_DIRBLKSIZ, then
this confused nfs_readdirrpc(), making it appear as if the directory
was truncated.

Reviewed by:	Karl Denninger <karl@Mcs.Net>
1996-10-21 10:07:52 +00:00
Poul-Henning Kamp
8e774bbf9d Add four sysctl variables that joerg wanted. 1996-10-20 15:01:58 +00:00
Bruce Evans
e07fd62c16 Staticized `nfs_dwrite'. 1996-10-12 17:39:39 +00:00
Doug Rabson
f31dba4c5d This fixes a problem with the nfs socket handling code which happens
if a single process is performing a large number of requests (in this
case writing a large file).  The writing process could monopolise the
recieve lock and prevent any other processes from recieving their
replies.

It also adds a new sysctl variable 'vfs.nfs.dwrite' which controls the
behaviour which originally pointed out the problem.  When a process
writes to a file over NFS, it usually arranges for another process
(the 'iod') to perform the request.  If no iods are available, then it
turns the write into a 'delayed write' which is later picked up by the
next iod to do a write request for that file.  This can cause that
particular iod to do a disproportionate number of requests from a
single process which can harm performance on some NFS servers.  The
alternative is to perform the write synchronously in the context of
the original writing process if no iod is avaiable for asynchronous
writing.

The 'delayed write' behaviour is selected when vfs.nfs.dwrite=1 and
the non-delayed behaviour is selected when vfs.nfs.dwrite=0.  The
default is vfs.nfs.dwrite=1; if many people tell me that performance
is better if vfs.nfs.dwrite=0 then I will change the default.

Submitted by:	Hidetoshi Shimokawa <simokawa@sat.t.u-tokyo.ac.jp>
1996-10-11 10:15:33 +00:00
Nate Williams
030e2e9ebb In sys/time.h, struct timespec is defined as:
/*
         * Structure defined by POSIX.4 to be like a timeval.
         */
        struct timespec {
                time_t  ts_sec;         /* seconds */
                long    ts_nsec;        /* and nanoseconds */
        };

        The correct names of the fields are tv_sec and tv_nsec.

Reminded by:	James Drobina <jdrobina@infinet.com>
1996-09-19 18:21:32 +00:00
John Dyson
6476c0d204 Even though this looks like it, this is not a complex code change.
The interface into the "VMIO" system has changed to be more consistant
and robust.  Essentially, it is now no longer necessary to call vn_open
to get merged VM/Buffer cache operation, and exceptional conditions
such as merged operation of VBLK devices is simpler and more correct.

This code corrects a potentially large set of problems including the
problems with ktrace output and loaded systems, file create/deletes,
etc.

Most of the changes to NFS are cosmetic and name changes, eliminating
a layer of subroutine calls.  The direct calls to vput/vrele have
been re-instituted for better cross platform compatibility.

Reviewed by: davidg
1996-08-21 21:56:23 +00:00
Doug Rabson
09c6884729 Various fixes from frank@fwi.uva.nl (Frank van der Linden) via
rick@snowhite.cis.uoguelph.ca:

1. Clear B_NEEDCOMMIT in nfs_write to make sure that dirty data is
correctly send to the server.  If a buffer was dirtied when it was in
the B_DELWRI+B_NEEDCOMMIT state, the state of the buffer was left
unchanged and when the buffer was later cleaned, just a commit rpc was
made to the server to complete the previous write.  Clearing
B_NEEDCOMMIT ensures that another write is made to the server.

2. If a server returned a server (for whatever reason) returned an
answer to a write RPC that implied that fewer bytes than requested
were written, bad things would happen.

3. The setattr operation passed on the atime in stead of the mtime to
the server. The fix is trivial.

4. XIDs always started at 0, but this caused some servers (older DEC
OSF/1 3.0 so I've been told) who had very long-lasting XID caches to
get confused if, after a reboot of a BSD client, RPCs came in with a
XID that had in the past been used before from that client. Patch is
to use the current time in seconds as a starting point for XIDs. The
patch below is not perfect, because it requires the root fs to be
mounted first. This is because of the check BSD systems do, comparing
FS time to system time.

Reviewed by:	Bruce Evans, Terry Lambert.
Obtained from:  frank@fwi.uva.nl (Frank van der Linden) via rick@snowhite.cis.uoguelph.ca
1996-07-16 10:19:45 +00:00
Garrett Wollman
2c37256e5a Modify the kernel to use the new pr_usrreqs interface rather than the old
pr_usrreq mechanism which was poorly designed and error-prone.  This
commit renames pr_usrreq to pr_ousrreq so that old code which depended on it
would break in an obvious manner.  This commit also implements the new
interface for TCP, although the old function is left as an example
(#ifdef'ed out).  This commit ALSO fixes a longstanding bug in the
TCP timer processing (introduced by davidg on 1995/04/12) which caused
timer processing on a TCB to always stop after a single timer had
expired (because it misinterpreted the return value from tcp_usrreq()
to indicate that the TCB had been deleted).  Finally, some code
related to polling has been deleted from if.c because it is not
relevant t -current and doesn't look at all like my current code.
1996-07-11 16:32:50 +00:00
Bruce Evans
8cd5acbce0 Don't truncate minor or major numbers in the nfsv3 client. 1996-06-23 17:19:25 +00:00
Poul-Henning Kamp
5b28a6011f Fix for NFS_NOSERVER
Poul mentioned that he thought this was some kind of timing problem, and
that started me thinking. After a little poking around, I found that
nfs_timer() was completely disabled when NFS_NOSERVER was #defined.
But after looking at nfs_timer(), it seemed like it was something
required by both the client and server code, and disabling it outright
just didn't seem to make any sense. Parts of it relate only to the
NFS server side code, so I disabled those, but I re-enabled the rest
of the function and made sure that it would be called from nfs_init()
(in nfs_subs.c).

With nfs_timer() re-enabled, everything seems to work again. The only
other changes I made were to #ifdef away some variable declarations
in the NFS_NOSERVER case so that gcc would stop complaining about
unused variables.

Reviewed by:	phk
Submitted by:	Bill Paul <wpaul@skynet.ctr.columbia.edu>
1996-06-14 11:13:21 +00:00
David Greenman
2f9bae59d6 Moved the fsnode MALLOC to before the call to getnewvnode() so that the
process won't possibly block before filling in the fsnode pointer (v_data)
which might be dereferenced during a sync since the vnode is put on the
mnt_vnodelist by getnewvnode.

Pointed out by Matt Day <mday@artisoft.com>
1996-06-12 03:37:57 +00:00
Paul Traina
75985daf3b Clear flags before using an inactive buffer. This is a kludge, but
matches the code in bread().

Reviewed by:	bde
1996-06-08 05:59:04 +00:00
Poul-Henning Kamp
e911eafcba removed:
CLBYTES PD_SHIFT PGSHIFT NBPG PGOFSET CLSIZELOG2 CLSIZE pdei()
        ptei() kvtopte() ptetov() ispt() ptetoav() &c &c
new:
        NPDEPG

Major macro cleanup.
1996-05-02 14:21:14 +00:00
Bruce Evans
71d96b71c2 #include <sys/filedesc.h> explicitly instead of depending on it being
bogusly included by <sys/socketvar.h>.
1996-04-30 23:26:52 +00:00
Bruce Evans
21e0797227 Fixed nfs sysctls. They missed out on the fs -> vfs name changes from
Lite2.  This broke nfsstat.
1996-04-30 23:23:09 +00:00
Garrett Wollman
dc915e7cfc Kill XNS.
While we're at it, fix socreate() to take a process argument.  (This
was supposed to get committed days ago...)
1996-02-13 18:16:31 +00:00
Mike Pritchard
6c5e9bbdf5 Fix a bunch of spelling errors in the comment fields of
a bunch of system include files.
1996-01-30 23:02:38 +00:00
Bruce Evans
ad59a83d3e Fixed spelling of s_namlen so that this compiles again. 1996-01-25 00:45:37 +00:00
Poul-Henning Kamp
1ce9bf88c3 Use new printf features rather than local kludges. 1996-01-24 21:12:23 +00:00
Mike Pritchard
97f1b9871e Add a check to prevent a computation from underflowing and causing
a panic due to an attaempt to allocate a buffer for a terabyte or
so of data when an attempt is made to create sparse data (e.g.
a holey file) more than 1 block past the end of the file.

Note:  some other areas of this code need to be looked at,
since they might cause problems when the file size exceeds 2GB,
due to storing results in ints when the computations are being
done with quad sized variables.

Reviewed by:	bde
1996-01-24 18:52:18 +00:00
John Dyson
bd7e5f992e Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish
	overhead for merged cache.
Efficiency improvement for vfs_cluster.  It used to do alot of redundant
	calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files.  Additionally,
	fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources.  The pageout code
	will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into
	page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE),
	thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause
	that happens every 30seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the
	case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO
	buffers.
1996-01-19 04:00:31 +00:00
Poul-Henning Kamp
99cb299316 Add an option NFS_NOSERVER which saves 100K in the install kernel (or
any other kernel that uses it).  Use with option NFS.
1996-01-13 23:27:58 +00:00
Poul-Henning Kamp
d2f08956ec Don't print swap server as root server.
Submitted by:	Mattias.Gronlund@sa.erisoft.se (Mattias Gronlund)
1995-12-28 21:56:49 +00:00
Poul-Henning Kamp
0240c4db5b Move fs.nfs.nfsstats sysctl var back to it's old OID. 1995-12-22 15:57:38 +00:00
Poul-Henning Kamp
b8dce649f1 Staticize. 1995-12-17 21:14:36 +00:00
David Greenman
efeaf95a41 Untangled the vm.h include file spaghetti. 1995-12-07 12:48:31 +00:00
Bruce Evans
dee6b0ab68 Completed function declarations and/or added prototypes and/or moved
prototypes to the right place.
1995-12-03 10:03:12 +00:00
Bruce Evans
55054f3540 Completed function declarations, added prototypes and removed redundant
declarations.
1995-11-21 15:51:39 +00:00
Bruce Evans
512fef80a9 Completed function declarations and/or added prototypes. 1995-11-21 12:55:26 +00:00
Poul-Henning Kamp
9565c0e60d Get rid of hostnamelen variable. 1995-11-14 09:37:22 +00:00
Bruce Evans
e4f937b07b Included <sys/sysproto.h> to get central declarations for syscall args
structs and prototypes for syscalls.

Ifdefed duplicated decentralized declarations of args structs.  It's
convenient to have this visible but they are hard to maintain.  Some
are already different from the central declarations.  4.4lite2 puts
them in comments in the function headers but I wanted to avoid the
large changes for that.
1995-11-14 05:16:37 +00:00
Bruce Evans
f57e65478d Introduced a type `vop_t' for vnode operation functions and used
it 1138 times (:-() in casts and a few more times in declarations.
This change is null for the i386.

The type has to be `typedef int vop_t(void *)' and not `typedef
int vop_t()' because `gcc -Wstrict-prototypes' warns about the
latter.  Since vnode op functions are called with args of different
(struct pointer) types, neither of these function types is any use
for type checking of the arg, so it would be preferable not to use
the complete function type, especially since using the complete
type requires adding 1138 casts to avoid compiler warnings and
another 40+ casts to reverse the function pointer conversions before
calling the functions.
1995-11-09 08:17:23 +00:00
Bruce Evans
8b25681eb5 Replaced bogus macros for dummy devswitch entries by functions.
These functions went away:

	enosys (hasn't been used for some time)
	enxio
	enodev
	enoioctl (was used only once, actually for a vop)

if_tun.c:
Continued cleaning up...

conf.h:
Probably fixed the type of d_reset_t.  It is hard to tell the correct
type because there are no non-dummy device reset functions.

Removed last vestige of ambiguous sleep message strings.
1995-11-06 00:36:19 +00:00
Joerg Wunsch
e046098fa9 Include a prerequisite header (so this is consistent again with the
NFSv2 state).
1995-10-31 21:17:59 +00:00
Poul-Henning Kamp
a98ca4699e Second batch of cleanup changes.
This time mostly making a lot of things static and some unused
variables here and there.
1995-10-29 15:33:36 +00:00
David Greenman
5344cc61f5 Fix order problem: unbusy pages before releasing the buffer.
Submitted by:	John Dyson <dyson>
1995-10-22 09:37:45 +00:00
David Greenman
d68a41903e Moved the filesystem read-only check out of the syscalls and into the
filesystem layer, as was done in lite-2. Merged in some other cosmetic
changes while I was at it. Rewrote most of msdosfs_access() to be more
like ufs_access() and to include the FS read-only check.

Obtained from: partially from 4.4BSD-lite2
1995-10-22 09:32:48 +00:00
John Dyson
c83ebe7781 Added VOP_GETPAGES/VOP_PUTPAGES and also the "backwards" block count
for VOP_BMAP.  Updated affected filesystems...
1995-09-04 00:21:16 +00:00
Doug Rabson
fac7bcd92c Make nfs diskless work again.
Reviewed by:	John Hay <jhay@mikom.csir.co.za>
1995-08-30 17:24:15 +00:00
David Greenman
dcc84850a7 Killed redundant declarations of nfsm_rpchead(). 1995-08-24 11:04:04 +00:00
Doug Rabson
c3b2cc769c Some fixes found using gcc -Wall:
nfsm_rpchead() has been called with the wrong number of args and misplaced
args since someone added new args in the middle for nfsv3.

Here's another one that would be important on 64-bit systems.  VOP_READDIR
takes a `u_int **cookies' arg.

Submitted by:	Bruce Evans <bde@zeta.org.au>
1995-08-24 10:45:16 +00:00
Doug Rabson
27df97742b Add support for amd direct maps.
Reviewed by:	Thomas Graichen <graichen@sirius.physik.fu-berlin.de>
1995-08-24 10:17:39 +00:00
David Greenman
628641f8a6 Converted mountlist to a CIRCLEQ.
Partially obtained from: 4.4BSD-Lite2
1995-08-11 11:31:18 +00:00
David Greenman
4777741358 Removed my special-case hack for VOP_LINK and fixed the problem with the
wrong vp's ops vector being used by changing the VOP_LINK's argument order.
The special-case hack doesn't go far enough and breaks the generic
bypass routine used in some non-leaf filesystems. Pointed out by Kirk
McKusick.
1995-08-01 18:51:02 +00:00
Bruce Evans
28f8db1403 Eliminate sloppy common-style declarations. There should be none left for
the LINT configuation.
1995-07-29 11:44:31 +00:00
Doug Rabson
e1b686876b Slightly better fix than previous revision.
Submitted by:	Rick Macklem <rick@snowhite.cis.uoguelph.ca>
1995-07-24 16:38:05 +00:00
Doug Rabson
47212158a0 Fix a problem which appeared to truncate a file to the nearest block boundary
when it is moved to an NFS filesystem from from another filesystem and /bin/mv
failed to set the file ownership during the move.

I believe that this bug is present in STABLE but I have not tested it.  The fix
would be the same in STABLE even though the code has changed quite considerably
in CURRENT.
1995-07-24 12:50:49 +00:00
David Greenman
d15ea423f7 Correct my cut-'n-paste job from ffs_vfsops.c and fix up the formatting
to be similar.

Submitted by:	Bruce Evans
1995-07-22 03:32:18 +00:00
David Greenman
f5526ddb48 Implemented an nfs_node hash list lock, similar to what was implemented
in ffs_vget(), and for the same reason: to prevent a race condition that
results in duplicate vnodes/NFSnodes being allocated.
1995-07-21 10:25:13 +00:00
David Greenman
24aa09cd4f vnode_pager_alloc() never returns NULL, so don't check for it. 1995-07-20 09:43:12 +00:00
Doug Rabson
5d81e553a2 I believe that the following fix to nfs_vnops.c should do the trick w.r.t.
the problem "when a file is truncated on the server after being written on
a client under NFSv3, the client doesn't see the size drop to zero".
(As you noted, the problem is that NMODIFIED wasn't being cleared by nfs_close
 when it flushed the buffers. After checking through the code, the only place
 where NMODIFIED was used to test for the possibility of dirty blocks was in
 nfs_setattr(). The two cases are safe to do when there aren't dirty blocks,
 so I just took out the tests. Unfortunately, testing for
 v_dirtyblkhd.lh_first being non-null is not sufficient, since there are
 times when the code moves blocks to the clean list and then back to the
 dirty list.)

Submitted by:	rick@snowhite.cis.uoguelph.ca
1995-07-13 17:55:12 +00:00
David Greenman
24a1cce34f NOTE: libkvm, w, ps, 'top', and any other utility which depends on struct
proc or any VM system structure will have to be rebuilt!!!

Much needed overhaul of the VM system. Included in this first round of
changes:

1) Improved pager interfaces: init, alloc, dealloc, getpages, putpages,
   haspage, and sync operations are supported. The haspage interface now
   provides information about clusterability. All pager routines now take
   struct vm_object's instead of "pagers".

2) Improved data structures. In the previous paradigm, there is constant
   confusion caused by pagers being both a data structure ("allocate a
   pager") and a collection of routines. The idea of a pager structure has
   escentially been eliminated. Objects now have types, and this type is
   used to index the appropriate pager. In most cases, items in the pager
   structure were duplicated in the object data structure and thus were
   unnecessary. In the few cases that remained, a un_pager structure union
   was created in the object to contain these items.

3) Because of the cleanup of #1 & #2, a lot of unnecessary layering can now
   be removed. For instance, vm_object_enter(), vm_object_lookup(),
   vm_object_remove(), and the associated object hash list were some of the
   things that were removed.

4) simple_lock's removed. Discussion with several people reveals that the
   SMP locking primitives used in the VM system aren't likely the mechanism
   that we'll be adopting. Even if it were, the locking that was in the code
   was very inadequate and would have to be mostly re-done anyway. The
   locking in a uni-processor kernel was a no-op but went a long way toward
   making the code difficult to read and debug.

5) Places that attempted to kludge-up the fact that we don't have kernel
   thread support have been fixed to reflect the reality that we are really
   dealing with processes, not threads. The VM system didn't have complete
   thread support, so the comments and mis-named routines were just wrong.
   We now use tsleep and wakeup directly in the lock routines, for instance.

6) Where appropriate, the pagers have been improved, especially in the
   pager_alloc routines. Most of the pager_allocs have been rewritten and
   are now faster and easier to maintain.

7) The pagedaemon pageout clustering algorithm has been rewritten and
   now tries harder to output an even number of pages before and after
   the requested page. This is sort of the reverse of the ideal pagein
   algorithm and should provide better overall performance.

8) Unnecessary (incorrect) casts to caddr_t in calls to tsleep & wakeup
   have been removed. Some other unnecessary casts have also been removed.

9) Some almost useless debugging code removed.

10) Terminology of shadow objects vs. backing objects straightened out.
    The fact that the vm_object data structure escentially had this
    backwards really confused things. The use of "shadow" and "backing
    object" throughout the code is now internally consistent and correct
    in the Mach terminology.

11) Several minor bug fixes, including one in the vm daemon that caused
    0 RSS objects to not get purged as intended.

12) A "default pager" has now been created which cleans up the transition
    of objects to the "swap" type. The previous checks throughout the code
    for swp->pg_data != NULL were really ugly. This change also provides
    the rudiments for future backing of "anonymous" memory by something
    other than the swap pager (via the vnode pager, for example), and it
    allows the decision about which of these pagers to use to be made
    dynamically (although will need some additional decision code to do
    this, of course).

13) (dyson) MAP_COPY has been deprecated and the corresponding "copy
    object" code has been removed. MAP_COPY was undocumented and non-
    standard. It was furthermore broken in several ways which caused its
    behavior to degrade to MAP_PRIVATE. Binaries that use MAP_COPY will
    continue to work correctly, but via the slightly different semantics
    of MAP_PRIVATE.

14) (dyson) Sharing maps have been removed. It's marginal usefulness in a
    threads design can be worked around in other ways. Both #12 and #13
    were done to simplify the code and improve readability and maintain-
    ability. (As were most all of these changes)

TODO:

1) Rewrite most of the vnode pager to use VOP_GETPAGES/PUTPAGES. Doing
   this will reduce the vnode pager to a mere fraction of its current size.

2) Rewrite vm_fault and the swap/vnode pagers to use the clustering
   information provided by the new haspage pager interface. This will
   substantially reduce the overhead by eliminating a large number of
   VOP_BMAP() calls. The VOP_BMAP() filesystem interface should be
   improved to provide both a "behind" and "ahead" indication of
   contiguousness.

3) Implement the extended features of pager_haspage in swap_pager_haspage().
   It currently just says 0 pages ahead/behind.

4) Re-implement the swap device (swstrategy) in a more elegant way, perhaps
   via a much more general mechanism that could also be used for disk
   striping of regular filesystems.

5) Do something to improve the architecture of vm_object_collapse(). The
   fact that it makes calls into the swap pager and knows too much about
   how the swap pager operates really bothers me. It also doesn't allow
   for collapsing of non-swap pager objects ("unnamed" objects backed by
   other pagers).
1995-07-13 08:48:48 +00:00
David Greenman
06cb725951 Moved call to VOP_GETATTR() out of vnode_pager_alloc() and into the places
that call vnode_pager_alloc() so that a failure return can be dealt with.
This fixes a panic seen on NFS clients when a file being opened is deleted
on the server before the open completes.
1995-07-09 06:58:03 +00:00
Doug Rabson
05085e65f6 Use a consistent blocksize for sizing bufs to avoid panicing the bio system. 1995-07-07 11:01:31 +00:00
Doug Rabson
b7ae4efa24 Use the correct cred for nfs_commit operations. 1995-06-28 17:33:39 +00:00
David Greenman
aa2cabb958 1) Converted v_vmdata to v_object.
2) Removed unnecessary vm_object_lookup()/pager_cache(object, TRUE) pairs
   after vnode_pager_alloc() calls - the object is already guaranteed to be
   persistent.
3) Removed some gratuitous casts.
1995-06-28 12:01:13 +00:00
David Greenman
9879652657 Fixed VOP_LINK argument order botch. 1995-06-28 07:06:55 +00:00
Doug Rabson
a62dc40654 Changes to support version 3 of the NFS protocol.
The version 2 support has been tested (client+server) against FreeBSD-2.0,
IRIX 5.3 and FreeBSD-current (using a loopback mount).  The version 2 support
is stable AFAIK.
The version 3 support has been tested with a loopback mount and minimally
against an IRIX 5.3 server.  It needs more testing and may have problems.
I have patched amd to support the new variable length filehandles although
it will still only use version 2 of the protocol.

Before booting a kernel with these changes, nfs clients will need to at least
build and install /usr/sbin/mount_nfs.  Servers will need to build and
install /usr/sbin/mountd.

NFS diskless support is untested.

Obtained from: Rick Macklem <rick@snowhite.cis.uoguelph.ca>
1995-06-27 11:07:30 +00:00
Joerg Wunsch
ec05e1f5d7 The duplicate information returned in fa_type and fa_mode
is an ambiguity in the NFS version 2 protocol.

VREG should be taken literally as a regular file.  If a
server intents to return some type information differently
in the upper bits of the mode field (e.g. for sockets, or
FIFOs), NFSv2 mandates fa_type to be VNON.  Anyway, we
leave the examination of the mode bits even in the VREG
case to avoid breakage for bogus servers, but we make sure
that there are actually type bits set in the upper part of
fa_mode (and failing that, trust the va_type field).

NFSv3 cleared the issue, and requires fa_mode to not
contain any type information (while also introduing sockets
and FIFOs for fa_type).

The fix has been tested against a variety of NFS servers.
It fixes problems with the ``Tropic'' NFS server for Windows,
while apparently not breaking anything.

Pointed-out by: scott@zorch.sf-bay.org (Scott Hazen Mueller)
1995-06-14 06:23:38 +00:00
Rodney W. Grimes
d3628763db Merge RELENG_2_0_5 into HEAD 1995-06-11 19:33:05 +00:00
Rodney W. Grimes
9b2e535452 Remove trailing whitespace. 1995-05-30 08:16:23 +00:00
David Greenman
77f53bcf27 Fixed some serious bugs that resulted in object reference counts not being
handled correctly. This would manifest itself as "object deallocated too
many times" panics and perhaps other strange inconsistencies on NFS servers.

Reviewed by:	me, of course
Submitted by:	John Dyson
1995-05-29 04:01:09 +00:00
David Greenman
61f5d51062 Changes to fix the following bugs:
1) Files weren't properly synced on filesystems other than UFS. In some
   cases, this lead to lost data. Most likely would be noticed on NFS.
   The fix is to make the VM page sync/object_clean general rather than
   in each filesystem.
2) Mixing regular and mmaped file I/O on NFS was very broken. It caused
   chunks of files to end up as zeroes rather than the intended contents.
   The fix was to fix several race conditions and to kludge up the
   "b_dirtyoff" and "b_dirtyend" that NFS relies upon - paying attention
   to page modifications that occurred via the mmapping.

Reviewed by:	David Greenman
Submitted by:	John Dyson
1995-05-21 21:39:31 +00:00
David Greenman
a401ebbe32 Changed swap partition handling/allocation so that it doesn't
require specific partitions be mentioned in the kernel config
file ("swap on foo" is now obsolete).

From Poul-Henning:

The visible effect is this:

As default, unless
        options "NSWAPDEV=23"
is in your config, you will have four swap-devices.
You can swapon(2) any block device you feel like, it doesn't have
to be in the kernel config.

There is a performance/resource win available by getting the NSWAPDEV right
(but only if you have just one swap-device ??), but using that as default
would be too restrictive.

The invisible effect is that:

Swap-handling disappears from the $arch part of the kernel.
It gets a lot simpler (-145 lines) and cleaner.

Reviewed by:	John Dyson, David Greenman
Submitted by:	Poul-Henning Kamp, with minor changes by me.
1995-05-14 03:00:10 +00:00
John Dyson
e0a4d029a5 Slight re-ordering of the creation of a vmio object to fix a condition
that can cause NFS I/O failures.
1995-04-21 02:58:49 +00:00
David Greenman
8cbf8eafa6 Various fixes from John Dyson:
1) Rewrote screwy code that uses an incore buffer without making it busy.
2) Use B_CACHE instead of B_DONE in cases where it is appropriate.
3) Minor code optimization.

This *might* fix kern/345 submitted by Heikki Suonsivu.
1995-04-16 05:05:25 +00:00
David Greenman
64709e7406 Deleted bogus DIAGNOSTIC "nfs_fsync: dirty" message. This can and does
happen normally when there is heavy write activity to a file since the
vnode isn't locked (NFS plays fast and loose with vnode locks). This change
"fixes" PR#267.
1995-03-23 09:43:40 +00:00
Garrett Wollman
bbf3a56610 Add four more filesystem flags:
VFCF_NETWORK (this FS goes over the net)
	VFCF_READONLY (read-write mounts do not make any sense)
	VFCF_SYNTHETIC (data in this FS is not real)
	VFCF_LOOPBACK (this FS aliases something else)

cd9660 is readonly; nullfs, umapfs, and union are loopback; NFS is netowkr;
procfs, kernfs, and fdesc are synthetic.
1995-03-16 20:23:48 +00:00
Bruce Evans
b5e8ce9f12 Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'.  Fix all the bugs found.  There were no serious
ones.
1995-03-16 18:17:34 +00:00
David Greenman
403ef252fa Removed obsolete vtrace() remnants. 1995-03-04 03:24:45 +00:00
Poul-Henning Kamp
78ff637a2c YF fix. 1995-02-15 04:21:32 +00:00
David Greenman
efefea024a Fixed two more bugs related to the merged cache changes.
Submitted by:	John Dyson
1995-02-15 03:40:00 +00:00
Poul-Henning Kamp
473e9734a8 YFfix
+int    nfsrv_vput __P(( struct vnode * ));
+int    nfsrv_vrele __P(( struct vnode * ));
+int    nfsrv_vmio __P(( struct vnode * ));
1995-02-14 06:22:18 +00:00
David Greenman
081129c5e3 Changed order of release of vnode/object to fix a problem where the vnode
is freed with an old object still attached (subsequently causing a panic).
Fixes NFS server panic "object/pager mismatch".

Submitted by:	John Dyson
1995-02-06 02:20:40 +00:00
David Greenman
efc68ce10f Fixed bmap run-length brokeness.
Use bmap run-length extension when doing clustered paging.

Submitted by:	John Dyson
1995-02-03 06:46:28 +00:00
David Greenman
ef762baa88 Removed a pile of vfs_unbusy_pages()...both unnecessary and wrong - resulted
in serious system instability. Changed a B_INVAL to a B_NOCACHE so that
buffer data is properly disposed of.

Submitted by:	John Dyson, Rick Macklin, and ohki@gssm.otsuka.tsukuba.ac.jp
1995-02-03 03:40:08 +00:00
David Greenman
740c65cfec Added two missing brelse() calls.
Submitted by:	rick@snowhite.cis.uoguelph.ca
1995-01-10 13:06:51 +00:00
David Greenman
0d94caffca These changes embody the support of the fully coherent merged VM buffer cache,
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.

The majority of the merged VM/cache work is by John Dyson.

The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.

vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme.  The scheme is almost fully compatible with the old filesystem
interface.  Significant improvement in the number of opportunities for write
clustering.

vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache.  Fixup of vfs_cluster to eliminate the bogus pagemove stuff.

vm_object.c:
Yet more improvements in the collapse code.  Elimination of some windows that
can cause list corruption.

vm_pageout.c:
Fixed it, it really works better now.  Somehow in 2.0, some "enhancements"
broke the code.  This code has been reworked from the ground-up.

vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.

pmap.c vm_map.c
Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of
kernel PTs.

vm_glue.c
Much simpler and more effective swapping code.  No more gratuitous swapping.

proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.

swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency.  Now the
code doesn't need it anymore.

machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.

machdep.c, kern_exec.c, vm_glue.c
Implemented a seperate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.

ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache.  Add "bypass" support for sneaking in on
busy buffers.

Submitted by:	John Dyson and David Greenman
1995-01-09 16:06:02 +00:00
Jordan K. Hubbard
c4cab716b9 From Gene Stark:
If nd->swap_nblks is zero in nfs_mountroot(), then the system
       comes up without initializing swapdev_vp to an actual vnode pointer.
       The swap pager assumes a non-NULL value for swapdev_vp.

       The fix is to try initializing local swap if no NFS swap space
       is specified.
1995-01-03 07:44:10 +00:00
Poul-Henning Kamp
d8290e4901 Would you please correct nfs/nfs_vfsops.c so that the ip address of the
root filesystem is printed out correctly?
It's line 299 in nfs/nfs_vfsops.c.

Reviewed by:	phk
Submitted by:	Luigi Rizzo luigi@iet.unipi.it
1994-12-08 20:59:33 +00:00
Garrett Wollman
d9e91095ab Forward-declare a few structures to avoid warning messages. 1994-11-02 00:11:00 +00:00
Garrett Wollman
b43e29afed Implement fs.nfs MIB variables. 1994-10-23 23:26:18 +00:00
Poul-Henning Kamp
4a14171589 This is where the action is. I'm still not sure that swap is 100% OK, but
it seems to work.
1994-10-22 17:52:59 +00:00
Poul-Henning Kamp
6ae324074a This is a bunch of changes from NetBSD. There are a couple of bug-fixes.
But mostly it is changes to use the list-maintenance macros instead of
doing the pointer-gymnastics by hand.

Obtained from: NetBSD
1994-10-17 17:47:45 +00:00
David Greenman
35c10d2239 Got rid of map.h. It's a leftover from the rmap code, and we use rlists.
Changed swapmap into swaplist.
1994-10-09 07:35:18 +00:00
David Greenman
824789192c Use tsleep() rather than sleep so that 'ps' is more informative about
the wait.
1994-10-06 21:07:04 +00:00
Poul-Henning Kamp
48fbb6cc7e Prototyping and general gcc-shutting up. Gcc has one warning now which looks
bad, I will get to it eventually, unless somebody beats me to it.
1994-10-02 17:27:07 +00:00
Garrett Wollman
e21fa31a8e Make NFS loadable. 1994-09-22 22:10:49 +00:00
Garrett Wollman
c9b1d6048d More loadable VFS changes:
- Make a number of filesystems work again when they are statically compiled
  (blush)

- FIFOs are no longer optional; ``options FIFO'' removed from distributed
  config files.
1994-09-22 19:38:41 +00:00
Garrett Wollman
c901836c14 Implemented loadable VFS modules, and made most existing filesystems
loadable.  (NFS is a notable exception.)
1994-09-21 03:47:43 +00:00
David Greenman
1cdeb653a8 "bogus" fixes from 1.1.5 to work around some cache coherency problems. 1994-08-29 06:09:15 +00:00
Paul Richards
33420ec6e7 More idempotency....... this is fun :-) 1994-08-21 06:50:16 +00:00
David Greenman
e0e9c42112 Implemented filesystem clean bit via:
machdep.c:
	Changed printf's a little and call vfs_unmountall() if the sync was
	successful.

cd9660_vfsops.c, ffs_vfsops.c, nfs_vfsops.c, lfs_vfsops.c:
	Allow dismount of root FS. It is now disallowed at a higher level.

vfs_conf.c:
	Removed unused rootfs global.

vfs_subr.c:
	Added new routines vfs_unmountall and vfs_unmountroot. Filesystems
	are now dismounted if the machine is properly rebooted.

ffs_vfsops.c:
	Toggle clean bit at the appropriate places. Print warning if an
	unclean FS is mounted.

ffs_vfsops.c, lfs_vfsops.c:
	Fix bug in selecting proper flags for VOP_CLOSE().

vfs_syscalls.c:
	Disallow dismounting root FS via umount syscall.
1994-08-20 16:03:26 +00:00
Garrett Wollman
f23b4c91c4 Fix up some sloppy coding practices:
- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
  header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.
1994-08-18 22:36:09 +00:00
David Greenman
4c5483f462 Initialize lockf pointer. I missed this when I made NFS use the generic
advlock mechanism, and not doing so results in random system crashes.
1994-08-10 19:48:23 +00:00
David Greenman
4b43e1d8ca Removed some padding bytes from the nfsnode struct to make the structure
size a power of 2 again. The system complains otherwise - probably because
it wastes space with our malloc scheme otherwise.
1994-08-09 15:10:14 +00:00
David Greenman
92dc7331c9 Made lockf advisory locking code generic (rather than ufs specific), and
use it in NFS. This is required both for diskless support and for POSIX
compliance. Note: the support in NFS is only for the local node.

Submitted by:	based on work originally done by Yuval Yurom
1994-08-08 17:31:01 +00:00
David Greenman
866dba7305 Changed B_AGE policy to work correctly in a world with relatively large
buffer caches. The old policy generally ended up caching nothing.
1994-08-08 09:11:44 +00:00
David Greenman
a03460f16e Converted 'vmunix' to 'kernel'. 1994-08-05 09:28:55 +00:00
David Greenman
b1b7658158 Made NFS attribute cache timeouts kernel config file tunable via
NFS_MINATTRTIMO and NFS_MAXATTRTIMO.
1994-08-04 06:03:46 +00:00
David Greenman
3c4dd3568f Added $Id$ 1994-08-02 07:55:43 +00:00
Rodney W. Grimes
26f9a76710 The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by:	Rodney W. Grimes
Submitted by:	John Dyson and David Greenman
1994-05-25 09:21:21 +00:00
Rodney W. Grimes
df8bae1de4 BSD 4.4 Lite Kernel Sources 1994-05-24 10:09:53 +00:00