Commit Graph

151 Commits

Author SHA1 Message Date
wpaul
ce6595c493 Fix a condition where nfs_statfs() can precipitate a panic. There is
code that says this:

        nfsm_request(vp, NFSPROC_FSSTAT, p, cred);
        if (v3)
                nfsm_postop_attr(vp, retattr);
        if (!error)
                nfsm_dissect(sfp, struct nfs_statfs *, NFSX_STATFS(v3));

The problem here is that if error != 0, nfsm_dissect() will not be
called, which leaves sfp == NULL. But nfs_statfs() does not bail out
at this point: it continues processing until it tries to dereference
sfp, which causes a panic. I was able to generate this crash under
the following conditions:

1) Set up a machine as an NFS server and NFS client, with amd running
   (using NIS maps). /usr/local is exported, though any exported fs
   can can be used to trigger the bug.
2) Log in as normal user, with home directory mounted from a SunOS 4.1.3
   NFS server via amd (along with a few other NFS filesystems from same
   machine).
3) Su to root and type the following:
   # mount localhost:/usr/local /mnt
   # df

To fix the panic, I changed the code to read:

        if (!error) {
                nfsm_dissect(sfp, struct nfs_statfs *, NFSX_STATFS(v3));
        } else
                goto nfsmout;

This is a bit kludgy in that nfsmout is a label defined by the nfsm_subs.h
macros, but these macros are themselves more than a little kludgy. This
stops the machine from crashing, but does not fix the overall bug: 'error'
somehow becomes 5 (EIO) when a statfs() is performed on the locally mounted
NFS filesystem. This seems to only happen the first time the filesystem
is accesed: on subsequent accesses, it seems to work fine again.

Now, I know there's no practical use in mounting a local filesystem
via NFS, but doing it shouldn't cause the system to melt down.
1997-06-27 19:10:46 +00:00
tegge
99c93b125b Clear nfs_iodwant[myiod] when the nfsiod process exits due to a signal. 1997-06-25 21:07:26 +00:00
dfr
d0e3689dc2 Avoid small synchronous writes when an application does lots of random-access
short writes within a block (e.g. ld).
1997-06-25 08:35:41 +00:00
dfr
044a01ff54 Make nfs_lookup return a NULLVP on error so that DIAGNOSTIC kernels don't
panic.
1997-06-25 08:32:33 +00:00
dyson
1f8ae0bc17 Upgrade NFS to support the new vfs_bio resource/buffer management. 1997-06-16 00:23:40 +00:00
tegge
00328205b8 Move commonly used code into static functions in order to reduce kernel bloat. 1997-06-12 14:08:20 +00:00
tegge
293ce38d5e Remove unused routines. 1997-06-12 14:03:16 +00:00
dfr
79a7930874 Fix a problem caused by removing large numbers of files from a directory
which could cause a bad size to be given to uiomove, causing a page fault.
1997-06-06 08:12:17 +00:00
dfr
76b427792d Various fixes from NetBSD:
Use u_int for rpc procedure numbers.
	Some fixes to NQNFS.
	A rare NULL pointer dereference.
	Ignore NFSMNT_NOCONN for TCP mounts.

Obtained from:	NetBSD
1997-06-03 17:22:47 +00:00
dfr
ac25b2ae50 Implement the async mount option for NFSv3. This makes NFS pretend that all
writes sent to the server were synchronous and therefore no commits are
needed.  This is the same as the vfs.nfs.async variable on the server but
allows each client to choose whether to work this way.

Also make the vfs.nfs.async variable do the 'right' thing for NFSv3, i.e.
pretend that the write was synchronous.
1997-06-03 13:56:55 +00:00
dfr
3b9f4c26be Fix a problem with nfs_flush where if many B_NEEDCOMMIT buffers are
attached to the vnode, some of them could be re-written synchronously
(if they overflowed the fixed size array nfs_flush had for them).  The
fix involves mallocing an array if there are more than its limited
size stack buffer.

Reviewed by:	Hidetoshi Shimokawa <simokawa@sat.t.u-tokyo.ac.jp>
1997-06-03 10:03:40 +00:00
dfr
8bcc683195 Fix some performance problems with the NFS mmap fixes. 1997-06-03 09:42:43 +00:00
dfr
5e9d2676f4 Plug a memory leak in nfs_link.
PR:		kern/1001
1997-05-20 08:06:31 +00:00
dfr
b07b491ee6 Fix a few bugs with NFS and mmap caused by NFS' use of b_validoff
and b_validend.  The changes to vfs_bio.c are a bit ugly but hopefully
can be tidied up later by a slight redesign.

PR:		kern/2573, kern/2754, kern/3046 (possibly)
Reviewed by:	dyson
1997-05-19 14:36:56 +00:00
phk
fb0e4c639b Remove redundant check for vp == dvp (done in VFS before calling). 1997-05-17 18:32:53 +00:00
tegge
d1cca0706f Use same syntax as netboot for root and swap mounts.
Handle mount options.
Ignore T16 (swap server address) and T6 (DNS server).
1997-05-14 01:36:51 +00:00
dfr
5c19f50214 Check the B_CLUSTER flag when choosing whether to use unstable or filesync
writes.

PR:		kern/3438
Submitted by:	Tor Egge <Tor.Egge@idi.ntnu.no>
1997-05-13 19:41:32 +00:00
dfr
559f65c7db Don't keep addresses in mbuf chains. This should simplify the next round
of network changes from Garret.

Reviewed by:	Garrett Wollman <wollman@khavrinen.lcs.mit.edu>
1997-05-13 17:25:44 +00:00
tegge
a78905acf9 Use the old nfs arguments in the nfs_diskless structure, to be
compatible with boot proms made from the 2.2 source.
Convert the nfs arguments when copying to the new diskless structure.
Copy the gateway field in the diskless structure.
1997-05-12 19:02:56 +00:00
tegge
450e76cf21 Bring in some kernel bootp support. This removes the need for netboot
to fill in the nfs_diskless structure, at the cost of some kernel
bloat. The advantage is that this code works on a wider range of
network adapters than netboot. Several new kernel options are
documented in LINT.
Obtained from: parts of the code comes from NetBSD.
1997-05-11 18:05:39 +00:00
dfr
726da88c32 Implement a separate control for write gathering on NFSv3. This is turned
off for NFSv3 by default since write gathering seems to reduce performance
for NFSv3 by up to 60%.

Add sysctl knobs to control both variables.
1997-05-10 16:59:36 +00:00
dfr
05a1cc315d Fix a nasty hang connected with write gathering. Also add debug print
statements to bits of the server which helped me find the hang.
1997-05-10 16:12:03 +00:00
dfr
edb588764c Prevent a mapped root which appears on the server as e.g. nobody from
accessing files which it shouldn't be able to.  This required a better
approximation of VOP_ACCESS for NFSv2 (NFSv3 already has an ACCESS rpc
which is a better solution) and adding a call to VOP_ACCESS from VOP_LOOKUP.

PR:		kern/876, kern/2635
Submitted by:	David Malone <dwmalone@maths.tcd.ie> (for kern/2635)
1997-05-09 13:18:42 +00:00
dfr
e30733b8ae Fix memory leak caused by the fact that the directory offset cookies and
the sillyrename information are stored in the same place.
1997-05-09 13:04:43 +00:00
phk
f06fb013e8 Now I can even execute "df" on my diskless :-) 1997-05-04 15:04:49 +00:00
phk
5962b6cdb1 1. Add a {pointer, v_id} pair to the vnode to store the reference to the
".." vnode.  This is cheaper storagewise than keeping it in the
    namecache, and it makes more sense since it's a 1:1 mapping.

2.  Also handle the case of "." more intelligently rather than stuff
    the namecache with pointless entries.

3.  Add two lists to the vnode and hang namecache entries which go from
    or to this vnode.  When cleaning a vnode, delete all namecache
    entries it invalidates.

4.  Never reuse namecache enties, malloc new ones when we need it, free
    old ones when they die.  No longer a hard limit on how many we can
    have.

5.  Remove the upper limit on namelength of namecache entries.

6.  Make a global list for negative namecache entries, limit their number
    to a sysctl'able (debug.ncnegfactor) fraction of the total namecache.
    Currently the default fraction is 1/16th.  (Suggestions for better
    default wanted!)

7.  Assign v_id correctly in the face of 32bit rollover.

8.  Remove the LRU list for namecache entries, not needed.  Remove the
    #ifdef NCH_STATISTICS stuff, it's not needed either.

9.  Use the vnode freelist as a true LRU list, also for namecache accesses.

10. Reuse vnodes more aggresively but also more selectively, if we can't
    reuse, malloc a new one.  There is no longer a hard limit on their
    number, they grow to the point where we don't reuse potentially
    usable vnodes.  A vnode will not get recycled if still has pages in
    core or if it is the source of namecache entries (Yes, this does
    indeed work :-)  "." and ".." are not namecache entries any longer...)

11. Do not overload the v_id field in namecache entries with whiteout
    information, use a char sized flags field instead, so we can get
    rid of the vpid and v_id fields from the namecache struct.  Since
    we're linked to the vnodes and purged when they're cleaned, we don't
    have to check the v_id any more.

12. NFS knew about the limitation on name length in the namecache, it
    shouldn't and doesn't now.

Bugs:
        The namecache statistics no longer includes the hits for ".."
        and "." hits.

Performance impact:
        Generally in the +/- 0.5% for "normal" workstations, but
        I hope this will allow the system to be selftuning over a
        bigger range of "special" applications.  The case where
        RAM is available but unused for cache because we don't have
        any vnodes should be gone.

Future work:
        Straighten out the namecache statistics.

        "desiredvnodes" is still used to (bogusly ?) size hash
        tables in the filesystems.

        I have still to find a way to safely free unused vnodes
        back so their number can shrink when not needed.

        There is a few uses of the v_id field left in the filesystems,
        scheduled for demolition at a later time.

        Maybe a one slot cache for unused namecache entries should
        be implemented to decrease the malloc/free frequency.
1997-05-04 09:17:38 +00:00
phk
14f0973281 Make nfs roots (diskless) functional again. It may still not be correct,
but it is functional.
1997-05-03 13:42:50 +00:00
dfr
d9577b81bb Allow NULL rpcs on non-privileged ports at all times to work around broken
clients.

PR:		kern/3298
Submitted by:	Tor Egge <Tor.Egge@idi.ntnu.no>
1997-04-30 09:51:37 +00:00
wollman
ce1a2d2262 The long-awaited mega-massive-network-code- cleanup. Part I.
This commit includes the following changes:
1) Old-style (pr_usrreq()) protocols are no longer supported, the compatibility
glue for them is deleted, and the kernel will panic on boot if any are compiled
in.

2) Certain protocol entry points are modified to take a process structure,
so they they can easily tell whether or not it is possible to sleep, and
also to access credentials.

3) SS_PRIV is no more, and with it goes the SO_PRIVSTATE setsockopt()
call.  Protocols should use the process pointer they are now passed.

4) The PF_LOCAL and PF_ROUTE families have been updated to use the new
style, as has the `raw' skeleton family.

5) PF_LOCAL sockets now obey the process's umask when creating a socket
in the filesystem.

As a result, LINT is now broken.  I'm hoping that some enterprising hacker
with a bit more time will either make the broken bits work (should be
easy for netipx) or dike them out.
1997-04-27 20:01:29 +00:00
dfr
f609a44d5c Fix broken usage of nm_readdirsize and increase the socket buffers for UDP
to prevent possible socket overflows.

2.2 candidate.

PR:		kern/3304
Reviewed by:	Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>
1997-04-22 17:38:01 +00:00
dfr
2fbca0d3bc Fix a bug where a program which appended many small records to a file could
wind up writing zeros instead of real data when the file is on an NFSv2
mounted directory.

While tracking this bug down, I noticed that nfs_asyncio was waking *all*
the iods when a block was written instead of just one per block.  Fixing this
gives a 25% performance improvment for writes on v2 (less for v3).

Both are 2.2 candidates.

PR:		kern/2774
1997-04-19 14:28:36 +00:00
dfr
b0fcba6492 Don't allow partial buffers to be cluster-comitted.
Zero the b_dirty{off,end} after cluster-comitting a group of buffers.

With these fixes, I was able to complete a 'make world' with remote src
and obj directories.
1997-04-18 14:12:17 +00:00
dfr
d928486c2b Fix various bugs in the locking protocol, allowing proper shared locks
to be used.  This should fix the lock panics that people are seeing.
1997-04-04 17:49:35 +00:00
dfr
c0990dbb1c The code which recovered from a modified directory situation did not check
for eof when re-caching the directory.  This could cause it to loop forever
if a directory was truncated.
1997-04-03 07:52:00 +00:00
bde
fc0d51eb5c Removed #include of <ufs/ufs/dir.h>. Nfs no longer depends on any ufs
features, and the one thing that it depended on (DIRBLKSIZ) now has
conflicting spelling.
1997-03-29 12:40:20 +00:00
bde
a71902b731 Define our own version of DIRBLKSIZ instead of (ab)using ufs's value.
Use the same value of 512 (ufs actually uses DEV_BSIZE).  There are
too many versions of DIRBLKSIZ, one for ufs, one for ext2fs, one for
nfs, one for ibcs2, one for linux, one for applications, ... I think
nfs's DIRBLKSIZ needs to be a divisor of the directory blocks sizes
of all supported file systems.  There is also NFS_DIRBLKSIZ, which is
different from nfs's DIRBLKSIZ but is sometimes confused with it in
comments.

Removed a bogus #ifdef KERNEL that hid the tunable constants for nfs.
This came in undocumented with the Lite2 merge although it isn't in
Lite2.  It required more-bogus #define KERNEL's in fstat and pstat
to make the constants visible.

Restored a spelling fix from rev.1.17.

Removed duplicate #defines of all the the NFS mount option flags.
1997-03-29 12:34:33 +00:00
guido
1ab53abb24 Add code that will reject nfs requests in teh kernel from nonprivileged
ports. This option will be automatically set/cleraed when mount is run
without/with the -n option.
Reviewed by:	Doug Rabson
1997-03-27 20:01:07 +00:00
bde
c9a2e7db06 Don't include <sys/ioctl.h> in the kernel. Stage 2: include
<sys/sockio.h> instead of <sys/ioctl.h> in network files.
1997-03-24 11:33:46 +00:00
bde
fe907b0a86 Fixed some invalid (non-atomic) accesses to `time', mostly ones of the
form `tv = time'.  Use a new function gettime().  The current version
just forces atomicicity without fixing precision or efficiency bugs.
Simplified some related valid accesses by using the central function.
1997-03-22 06:53:45 +00:00
bde
183db9bd07 YAMInTheWrongDirectionF22 (part of rev.1.28.2.3: set B_CLUSTEROK for
commits).
1997-03-09 10:21:26 +00:00
bde
12f3863ec8 Fixed a panic in nfs_writevp(). Lite2 provided a fix for a silly
missing-parentheses bug, but this exposed a misplaced vfs_busy_pages().
This bug cost a factor of 2.5-3 in nfsv3 write performance!  It should
be fixed in 2.2.

Removed some debugging code that gets triggered often in normal
operation.  There are still many backwards diagnostics (#define
DIAGNOSTIC gives no diagnostics).

Submitted by:	vfs_busy_pages() fix by dfr
1997-02-28 17:56:27 +00:00
peter
c8dcd04895 Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.
1997-02-22 09:48:43 +00:00
bde
f72f8dee52 Changed #ifdef COMPAT_PRELITE2' to #ifndef NO_COMPAT_PRELITE2' so that
old nfs mount calls are supported by default.
1997-02-18 04:40:38 +00:00
dyson
8210b8b788 This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
		Mount_std mounts will not work until the getfsent
		library routine is changed.

Reviewed by:	various people
Submitted by:	Jeffery Hsu <hsu@freebsd.org>
1997-02-10 02:22:35 +00:00
jkh
9c0cd3f9df Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
wpaul
0018338d03 Fix (properly, I hope) 'panic: sillyrename dir' crash that can happen
if you do:

% cd /nfsdir
% mkdir -p foo/foo
% mv foo/foo .

nfs_sillyrename() self-destructs if you try to sillyrename a directory,
however nfs_rename() can be coerced into doing just that by the above
sequence of commands. To avoid this, nfs_rename() now checks that
v_type of the 'destination' vnode != VDIR before attempting the
sillyrename. The server correctly handles this particular situation
by returning ENOTEMPTY on the rename() attempt.

I asked if this was the correct fix for this on -hackers but nobody
ever answered.

This is a 2.2 candidate.
1996-12-31 07:10:19 +00:00
wollman
dcbe1b50a7 Convert the interface address and IP interface address structures
to TAILQs.  Fix places which referenced these for no good reason
that I can see (the references remain, but were fixed to compile
again; they are still questionable).
1996-12-13 21:29:07 +00:00
dfr
7abcc0e65a Improve the queuing algorithms used by NFS' asynchronous i/o. The
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others.  This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance.  The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.

The new mechanism uses a queue for each mount point.  All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points.  This removes the sysctl variable vfs.nfs.dwrite
since the new queueing mechanism removes the old delayed write code
completely.

This should go into the 2.2 branch.
1996-11-06 10:53:16 +00:00
dfr
f91041c4a7 If a large (>4096 bytes) directory was modified, the old directory
contents are discarded, including the cached seek cookies.
Unfortunately, if the directory was larger than NFS_DIRBLKSIZ, then
this confused nfs_readdirrpc(), making it appear as if the directory
was truncated.

Reviewed by:	Karl Denninger <karl@Mcs.Net>
1996-10-21 10:07:52 +00:00
phk
ac176b00fe Add four sysctl variables that joerg wanted. 1996-10-20 15:01:58 +00:00