Commit Graph

663 Commits

Author SHA1 Message Date
Poul-Henning Kamp
b792bebeea Move the buffer method vector (buf->b_op) to the bufobj.
Extend it with a strategy method.

Add bufstrategy() which do the usual VOP_SPECSTRATEGY/VOP_STRATEGY
song and dance.

Rename ibwrite to bufwrite().

Move the two NFS buf_ops to more sensible places, add bufstrategy
to them.

Add inlines for bwrite() and bstrategy() which calls through
buf->b_bufobj->b_ops->b_{write,strategy}().

Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().
2004-10-24 20:03:41 +00:00
Poul-Henning Kamp
494eb176e7 Add b_bufobj to struct buf which eventually will eliminate the need for b_vp.
Initialize b_bufobj for all buffers.

Make incore() and gbincore() take a bufobj instead of a vnode.

Make inmem() local to vfs_bio.c

Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj)
also VI_MTX() to BO_MTX(),

Make buf_vlist_add() take a bufobj instead of a vnode.

Eliminate other uses of bp->b_vp where bp->b_bufobj will do.

Various minor polishing: remove "register", turn panic into KASSERT,
use new function declarations, TAILQ_FOREACH_SAFE() etc.
2004-10-22 08:47:20 +00:00
Poul-Henning Kamp
a76d8f4ec9 Move the VI_BWAIT flag into no bo_flag element of bufobj and call it BO_WWAIT
Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write
count on a bufobj.  Bufobj_wdrop() replaces vwakeup().

Use these functions all relevant places except in ffs_softdep.c where
the use if interlocked_sleep() makes this impossible.

Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
2004-10-21 15:53:54 +00:00
Pawel Jakub Dawidek
1a32dca7a3 Add a missing newline character. 2004-10-14 19:00:44 +00:00
David Schultz
506d3e1bcc nfsclient/nfs_bio.c has a PHOLD() without a PRELE(). Neither should
be necessary here.  Also, use killproc() instead of psignal().
2004-10-01 05:01:41 +00:00
Poul-Henning Kamp
c0f46dd1e4 Remove support for using NFS device nodes. 2004-09-28 08:50:01 +00:00
Poul-Henning Kamp
52c55a26b1 Remove NFS4 vop method vector for devices: we are desupporing device nodes
on anything but DEVFS and in this case it was not even used (see below).

Put the NFS4 vop method for fifo's behind "#if 0" because it is unused.
Add a XXX comment to say that I think the unusedness is a bug.
2004-09-27 20:02:50 +00:00
Poul-Henning Kamp
9f2b7bc4a8 style consistency. 2004-09-27 19:44:39 +00:00
Poul-Henning Kamp
08dbd671ff Remove unused B_WRITEINPROG flag 2004-09-15 21:49:22 +00:00
Poul-Henning Kamp
35f134080f Explicitly pass vnode to nfs_doio() and mountpoint to nfs_asyncio(). 2004-09-07 08:56:43 +00:00
Robert Watson
640c9dcf69 In nfs_timer(), pass curthread rather than &thread0 into the protocol
send routine.  In IPv6 UDP, the thread will be passed to suser(), which
asserts that if a thread is used for a super user check, it be
curthread.  Many of these protocol entry points probably need to
accept credentials instead of threads.

MT5 candidate.

Noticed/tested by:	kuriyama
2004-08-25 01:23:38 +00:00
Poul-Henning Kamp
5e8c582ac2 Put a version element in the VFS filesystem configuration structure
and refuse initializing filesystems with a wrong version.  This will
aid maintenance activites on the 5-stable branch.

s/vfs_mount/vfs_omount/

s/vfs_nmount/vfs_mount/

Name our filesystems mount function consistently.

Eliminate the namiedata argument to both vfs_mount and vfs_omount.
It was originally there to save stack space.  A few places abused
it to get hold of some credentials to pass around.  Effectively
it is unused.

Reorganize the root filesystem selection code.
2004-07-30 22:08:52 +00:00
Poul-Henning Kamp
0658bb8ef8 Move a relic to its correct location(s): Put nfs diskless initialization
calls with the code they call.  (Yet another example of mindless copy&paste).
2004-07-28 21:54:57 +00:00
Poul-Henning Kamp
d634f69316 Remove global variable rootdevs and rootvp, they are unused as such.
Add local rootvp variables as needed.

Remove checks for miniroot's in the swappartition.  We never did that
and most of the filesystems could never be used for that, but it had
still been copy&pasted all over the place.
2004-07-28 20:21:04 +00:00
Poul-Henning Kamp
cf95b5c381 Eliminate unused second argument to reassignbuf() and simplify it
accordingly.
2004-07-25 21:24:23 +00:00
Alfred Perlstein
8f0a7125a1 Turn off SO_REUSEADDR and SO_REUSEPORT, they were causing EADDRINUSE
to be returned from the protocol stack.

Pointy hat to me for not groking what those options _really_ mean.
2004-07-13 05:42:59 +00:00
David Malone
dcee93dcf9 Rename Alfred's kern_setsockopt to so_setsockopt, as this seems a
a better name. I have a kern_[sg]etsockopt which I plan to commit
shortly, but the arguments to these function will be quite different
from so_setsockopt.

Approved by:	alfred
2004-07-12 21:42:33 +00:00
Alfred Perlstein
f257b7a54b Make VFS_ROOT() and vflush() take a thread argument.
This is to allow filesystems to decide based on the passed thread
which vnode to return.
Several filesystems used curthread, they now use the passed thread.
2004-07-12 08:14:09 +00:00
Alfred Perlstein
d58d3648dd Use SO_REUSEADDR and SO_REUSEPORT when reconnecting NFS mounts.
Tune the timeout from 5 seconds to 12 seconds.
Provide a sysctl to show how many reconnects the NFS client has done.

Seems to fix IPv6 from: kuriyama
2004-07-12 06:22:42 +00:00
Brian Somers
0ac4013324 Change the following environment variables to kernel options:
bootp -> BOOTP
    bootp.nfsroot -> BOOTP_NFSROOT
    bootp.nfsv3 -> BOOTP_NFSV3
    bootp.compat -> BOOTP_COMPAT
    bootp.wired_to -> BOOTP_WIRED_TO

- i.e. back out the previous commit.  It's already possible to
pxeboot(8) with a GENERIC kernel.

Pointed out by: dwmalone
2004-07-08 22:35:36 +00:00
Brian Somers
59e1ebc9b5 Change the following kernel options to environment variables:
BOOTP -> bootp
    BOOTP_NFSROOT -> bootp.nfsroot
    BOOTP_NFSV3 -> bootp.nfsv3
    BOOTP_COMPAT -> bootp.compat
    BOOTP_WIRED_TO -> bootp.wired_to

This lets you PXE boot with a GENERIC kernel by putting this sort of thing
in loader.conf:

    bootp="YES"
    bootp.nfsroot="YES"
    bootp.nfsv3="YES"
    bootp.wired_to="bge1"

or even setting the variables manually from the OK prompt.
2004-07-08 13:40:33 +00:00
Robert Watson
cf813ab244 Acquire socket lock in nfs_connect() connection/sleep loop to protect
socket state and avoid missed wakeups.
2004-07-06 16:55:41 +00:00
Alfred Perlstein
14af415c34 use vfs_suser() to restrict access to the nfs mount's timeout. 2004-07-06 09:40:44 +00:00
Alfred Perlstein
67ac03405a NFS mobility Phase VI:
Export NFS mount state via sysctl.
Export timeout via sysctl.
2004-07-06 09:23:17 +00:00
Alfred Perlstein
c713aaaeca NFS mobility PHASE I, II & III (phase VI, and V pending):
Rebind the client socket when we experience a timeout.  This fixes
the case where our IP changes for some reason.

Signal a VFS event when NFS transitions from up to down and vice
versa.

Add a placeholder vfs_sysctl where we will put status reporting
shortly.

Also:
Make down NFS mounts return EIO instead of EINTR when there is a
soft timeout or force unmount in progress.
2004-07-06 09:12:03 +00:00
Poul-Henning Kamp
e3c5a7a4dd When we traverse the vnodes on a mountpoint we need to look out for
our cached 'next vnode' being removed from this mountpoint.  If we
find that it was recycled, we restart our traversal from the start
of the list.

Code to do that is in all local disk filesystems (and a few other
places) and looks roughly like this:

		MNT_ILOCK(mp);
	loop:
		for (vp = TAILQ_FIRST(&mp...);
		    (vp = nvp) != NULL;
		    nvp = TAILQ_NEXT(vp,...)) {
			if (vp->v_mount != mp)
				goto loop;
			MNT_IUNLOCK(mp);
			...
			MNT_ILOCK(mp);
		}
		MNT_IUNLOCK(mp);

The code which takes vnodes off a mountpoint looks like this:

	MNT_ILOCK(vp->v_mount);
	...
	TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes);
	...
	MNT_IUNLOCK(vp->v_mount);
	...
	vp->v_mount = something;

(Take a moment and try to spot the locking error before you read on.)

On a SMP system, one CPU could have removed nvp from our mountlist
but not yet gotten to assign a new value to vp->v_mount while another
CPU simultaneously get to the top of the traversal loop where it
finds that (vp->v_mount != mp) is not true despite the fact that
the vnode has indeed been removed from our mountpoint.

Fix:

Introduce the macro MNT_VNODE_FOREACH() to traverse the list of
vnodes on a mountpoint while taking into account that vnodes may
be removed from the list as we go.  This saves approx 65 lines of
duplicated code.

Split the insmntque() which potentially moves a vnode from one mount
point to another into delmntque() and insmntque() which does just
what the names say.

Fix delmntque() to set vp->v_mount to NULL while holding the
mountpoint lock.
2004-07-04 08:52:35 +00:00
Robert Watson
7322ba7d8b When updating sb_flags, acquire the socket buffer lock to prevent
races.
2004-06-24 03:12:13 +00:00
Poul-Henning Kamp
f3732fd15b Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
Robert Watson
8950572050 Remove bad cookie vp kernel printf; while it does notify about an
interesting event, there's little or nothing the user can do about
it.
2004-06-17 00:15:37 +00:00
Robert Watson
9ef2900f9d Convert GIANT_REQUIRED to NET_ASSERT_GIANT where Giant is used to
protect socket operations.  Leave one "as-is" as it also frobs
rootvp.
2004-06-16 03:12:50 +00:00
Alan Cox
5a32489377 Make vm_page's PG_ZERO flag immutable between the time of the page's
allocation and deallocation.  This flag's principal use is shortly after
allocation.  For such cases, clearing the flag is pointless.  The only
unusual use of PG_ZERO is in vfs_bio_clrbuf().  However, allocbuf() never
requests a prezeroed page.  So, vfs_bio_clrbuf() never sees a prezeroed
page.

Reviewed by:	tegge@
2004-05-06 05:03:23 +00:00
Peter Edwards
7c8ca9400e Let the NFS client notice a file's size changing as a modification.
This avoids presenting invalid data to the client's applications
when the file is modified, and then extended within the window of
the resolution of the modifcation timestamp.

Reviewed By:	iedowse
PR:		kern/64091
2004-04-14 23:23:55 +00:00
Marcel Moolenaar
5fa064a8d8 Unbreak build: s/TAILQ_ISEMPTY/TAILQ_EMPTY/g 2004-04-11 17:15:36 +00:00
Peter Edwards
1630ff08a3 Clean up properly when unloading NFS client module.
This includes a modified form of some code from Thomas Moestl (tmm@)
to properly clean up the UMA zone and the "nfsnodehashtbl" hash
table.

Reviewed By:	iedowse
PR:		16299
2004-04-11 13:30:20 +00:00
Warner Losh
2fcbca0d85 Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 05:00:01 +00:00
Robert Watson
ecd189d420 Spell 2 as SHUT_RDWR when used as an argument to soshutdown(). 2004-04-04 19:24:08 +00:00
Peter Edwards
5182703cc8 Flush cached access mode after modifying a files attributes for
NFSv3. It's likely that modifying the attributes will affect the
file's accessibility. This version of the patch is one suggested
by Ian Dowse after reviewing my original attempt in the PR

Reviewed By: iedowse
PR: kern/44336
MFC after: 3 days
2004-04-03 17:23:46 +00:00
Alexander Kabaev
186c0bc04b Reset callout if in nfs_timeout and rpcclnt_timeout functions. Timer
are supposed to continue firing as long as there is work to do, not
stop after the first invocation.

This is damage control after a patch that has been committed prematurely.

Tested by:	kris
2004-03-28 05:55:27 +00:00
Jim Rees
f9955a5f53 only do nfs rpc callouts if there is work to do.
Submitted by:	kan
Approved by:	alfred
2004-03-25 21:48:09 +00:00
Pawel Jakub Dawidek
512af1646c Add a comment with an explanation why we don't report EPIPE errors on
nfs sockets.

Requested by:	ru
2004-03-17 21:10:20 +00:00
Pawel Jakub Dawidek
41eda4e2b1 Don't report EPIPE errors on nfs sockets. These can be due to idle tcp
mounts which will be closed by netapp, solaris, etc. if left idle too long.

Obtained from:	NetBSD
2004-03-17 18:10:38 +00:00
Peter Wemm
6df0617286 Calculate NFS timeouts in units of 10ms, not 5ms. This matches the default
clock precision on i386.  This is a NOP change on i386.  But this stops
the mount_nfs units from suddenly changing to units of 1/20 of a second
(vs the normal 1/10 of a second) if HZ is increased.
2004-03-14 06:21:56 +00:00
Brooks Davis
41b7cd3729 Allow kernel with the BOOTP option to boot when DHCP/BOOTP sets the root
path to an absolute path without a host name.  Previously, there was a
nasty POLA violation where a system would PXE boot until you added the
BOOTP option and then it would panic instead.

Reviewed by:	tegge, Dirk-Willem van Gulik <dirkx at webweaving.org>
		(a previous version)
Submitted by:	tegge (getip function)
2004-03-12 20:37:40 +00:00
Poul-Henning Kamp
4d453ef101 Properly vector all bwrite() and BUF_WRITE() calls through the same path
and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().
2004-03-11 18:02:36 +00:00
Poul-Henning Kamp
651b11eaf2 Remove unused second arg to vfinddev().
Don't call addaliasu() on VBLK nodes.
2004-03-11 16:33:11 +00:00
Robert Watson
746e5bf09b Rename dup_sockaddr() to sodupsockaddr() for consistency with other
functions in kern_socket.c.

Rename the "canwait" field to "mflags" and pass M_WAITOK and M_NOWAIT
in from the caller context rather than "1" or "0".

Correct mflags pass into mac_init_socket() from previous commit to not
include M_ZERO.

Submitted by:	sam
2004-03-01 03:14:23 +00:00
Jim Rees
73c02c410e NFSv4 fixes from Connectathon 2004:
remove unused pid field of file context struct
map nfs4 error codes to errnos
eliminate redundant code from nfs4_request
use zero stateid on setattr that doesn't set file size
use same clientid on all mounts until reboot
invalidate dirty bufs in nfs4_close, to play it safe
open file for writing if truncating and it's not already open

Approved by:	alfred
2004-02-27 19:37:43 +00:00
Colin Percival
11233aabf8 If mountnfs returns an error, it will have already freed nam; no need to
free it again.

Reported by:	"Ted Unangst" <tedu@coverity.com>
Approved by:	rwatson (mentor)
2004-02-22 01:17:47 +00:00
John Baldwin
91d5354a2c Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count.  The plimit
  structure is treated similarly to struct ucred in that is is always copy
  on write, so having a reference to a structure is sufficient to read from
  it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading
  limits from a process to keep the limit structure from changing out from
  under you while reading from it.
- Various global limits that are ints are not protected by a lock since
  int writes are atomic on all the archs we support and thus a lock
  wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted
  behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return
  either an rlimit, or the current or max individual limit of the specified
  resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of
  other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit()
  (it didn't used the stackgap when it should have) but uses lim_rlimit()
  and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls,
  but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits.  It
  also no longer uses the stackgap for accessing sysctl's for the
  ibcs2_sysconf() syscall but uses kernel_sysctl() instead.  As a result,
  ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.

Submitted by:	mtm (mostly, I only did a few cleanups and catchups)
Tested on:	i386
Compiled on:	alpha, amd64
2004-02-04 21:52:57 +00:00
David E. O'Brien
f191a0bcf6 Bump the NFCv3/TCP defaults for rsize and wsize from 8K to 32K to match
Solaris and HP-UX.  This increases read performance for large files across NFS.

PR:		62024 & 26324
Submitted by:	Bjoern Groenvall <bg@sics.se>
2004-01-31 10:40:15 +00:00