Commit Graph

140 Commits

Author SHA1 Message Date
Chuck Lever
7d8a7e19c7 rick says:
The following bug was just identified in OpenBSD and it looks like the same
bug exists in the other BSDen NFS servers.

A Linux client (don't know which version, but you can look at
	http://bugzilla.kernel.org/show_bug.cgi?id=6256)
does a Setattr of mtime to the server's time, where the file is mode 0664 and
the client user has group access (ie. caller is not the file owner).

The BSD servers fail the Setattr with EPERM, since the VA_UTIMES_NULL flag
isn't set before doing the VOP_SETATTR.

It seems to me that this should be allowed, since it is allowed for a local
utimes(2). If so, the fix is to set VA_UTIMES_NULL for the
"set-time-to-server-time" cases of setting atime and/or mtime.

Submitted by:	rick@snowhite.cis.uoguelph.ca
Reviewed by:	cel
Approved by:	silby
MFC after:	1 week
2006-04-02 04:24:57 +00:00
Jeff Roberson
3bbd6d8ae6 - Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs().
Discussed with:	tegge
Tested by:	kris
Sponsored by:	Isilon Systems, Inc.
2006-03-31 03:54:20 +00:00
Jeff Roberson
89b0e10910 - Reorder calls to vrele() after calls to vput() when the vrele is a
directory.  vrele() may lock the passed vnode, which in these cases would
   give an invalid lock order of child -> parent.  These situations are
   deadlock prone although do not typically deadlock because the vrele
   is typically not releasing the last reference to the vnode.  Users of
   vrele must consider it as a call to vn_lock() and order it appropriately.

MFC After: 	1 week
Sponsored by:	Isilon Systems, Inc.
Tested by:	kkenn
2006-02-01 00:25:26 +00:00
John Baldwin
7e9e371f2d Use the refcount API to manage the reference count for user credentials
rather than using pool mutexes.

Tested on:	i386, alpha, sparc64
2005-09-27 18:09:42 +00:00
Sam Leffler
bd1da15f2a avoid potential null ptr deref by free'ing excess mbufs instead of
zero'ing their length (copied from m_adj where this code came from
after the equivalent change there has had time to soak)

Noticed by:	Coverity Prevent analysis tool
2005-03-28 18:51:58 +00:00
Poul-Henning Kamp
c62801a7f8 Don't try to create vnode_pager objects on other filesystems vnodes,
either they did it themselves or it won't happen.
2005-01-24 22:09:13 +00:00
Paul Saab
f1b3bfb348 Now that we have a non blocking version of nfsm_dissect(), change all the
nfsm_dissect() calls (done under the NFSD lock) to nfsm_dissect_nonblock().

Submitted by:	Mohan Srinivasan
2005-01-19 22:53:40 +00:00
Poul-Henning Kamp
e39db32ab0 Ditch vfs_object_create() and make the callers call VOP_CREATEVOBJECT()
directly.
2005-01-13 12:25:19 +00:00
Warner Losh
c398230b64 /* -> /*- for license, minor formatting changes 2005-01-07 01:45:51 +00:00
Robert Watson
9e0219d901 If debug.mpsafenet is non-zero, run the NFS server callout without
Giant.
2004-07-24 02:32:27 +00:00
Poul-Henning Kamp
3e019deaed Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
Bosko Milekic
d1fd2228b8 Giant wasn't dropped here if we have to return EBUSY. This is bad. 2004-05-31 20:21:06 +00:00
Robert Watson
30bef9add8 The NFS server modevent code manually patches the system call table to
install nfssvc().  It also updates the argument count, but did so
without setting SYF_MPSAFE, effectively removing the MPSAFE flag even
when syscalls.master indicates it doesn't require Giant.  This change
forces the modevent to set MPSAFE as a flag to its internal notion of
an argument coutn.

Note: this duplication of information is a bad thing, but is a more
general problem I'm not currently willing to address.
2004-05-31 00:59:10 +00:00
Robert Watson
1ee624b31d The socket code upcalls into the NFS server using the so_upcall
mechanism so that early processing on mbufs can be performed before
a context switch to the NFS server threads.  Because of this, if
the socket code is running without Giant, the NFS server also needs
to be able to run the upcall code without relying on the presence on
Giant.  This change modifies the NFS server to run using a "giant
code lock" covering operation of the whole subsystem.  Work is in
progress to move to data-based locking as part of the NFSv4 server
changes.

Introduce an NFS server subsystem lock, 'nfsd_mtx', and a set of
macros to operate on the lock:

  NFSD_LOCK_ASSERT()    Assert nfsd_mtx owned by current thread
  NFSD_UNLOCK_ASSERT()  Assert nfsd_mtx not owned by current thread
  NFSD_LOCK_DONTCARE()  Advisory: this function doesn't care
  NFSD_LOCK()           Lock nfsd_mtx
  NFSD_UNLOCK()         Unlock nfsd_mtx

Constify a number of global variables/structures in the NFS server
code, as they are not modified and contain constants only:

  nfsrvv2_procid       nfsrv_nfsv3_procid      nonidempotent
  nfsv2_repstat        nfsv2_type              nfsrv_nfsv3_procid
  nfsrvv2_procid       nfsrv_v2errmap          nfsv3err_null
  nfsv3err_getattr     nfsv3err_setattr        nfsv3err_lookup
  nfsv3err_access      nfsv3err_readlink       nfsv3err_read
  nfsv3err_write       nfsv3err_create         nfsv3err_mkdir
  nfsv3err_symlink     nfsv3err_mknod          nfsv3err_remove
  nfsv3err_rmdir       nfsv3err_rename         nfsv3err_link
  nfsv3err_readdir     nfsv3err_readdirplus    nfsv3err_fsstat
  nfsv3err_fsinfo      nfsv3err_pathconf       nfsv3err_commit
  nfsrv_v3errmap

There are additional structures that should be constified but due
to their being passed into general purpose functions without const
arguments, I have not yet converted.

In general, acquire nfsd_mtx when accessing any of the global NFS
structures, including struct nfssvc_sock, struct nfsd, struct
nfsrv_descript.

Release nfsd_mtx whenever calling into VFS, and acquire Giant for
calls into VFS.  Giant is not required for any part of the
operation of the NFS server with the exception of calls into VFS.
Giant will never by acquired in the upcall code path.  However, it
may operate entirely covered by Giant, or not.  If debug.mpsafenet
is set to 0, the system calls will acquire Giant across all
operations, and the upcall will assert Giant.  As such, by default,
this enables locking and allows us to test assertions, but should not
cause any substantial new amount of code to be run without Giant.
Bugs should manifest in the form of lock assertion failures for now.

This approach is similar (but not identical) to modifications to the
BSD/OS NFS server code snapshot provided by BSDi as part of their
SMPng snapshot.  The strategy is almost the same (single lock over
the NFS server), but differs in the following ways:

- Our NFS client and server code bases don't overlap, which means
  both fewer bugs and easier locking (thanks Peter!).  Also means
  NFSD_*() as opposed to NFS_*().

- We make broad use of assertions, whereas the BSD/OS code does not.

- Made slightly different choices about how to handle macros building
  packets but operating with side effects.

- We acquire Giant only when entering VFS from the NFS server daemon
  threads.

- Serious bugs in BSD/OS implementation corrected -- the snapshot we
  received was clearly a work in progress.

Based on ideas from:	BSDi SMPng Snapshot
Reviewed by:		rick@snowhite.cis.uoguelph.ca
Extensive testing by:	kris
2004-05-24 04:06:14 +00:00
Peter Edwards
ae00154c4b Don't let the NFS server module be unloaded as long as there are
nfsd processes running

Reviewed By:	iedowse
PR:		16299
2004-04-11 13:33:34 +00:00
Warner Losh
2fcbca0d85 Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 05:00:01 +00:00
Alexander Kabaev
fa3d2a12df Convert from timeout to callout API.
Submitted by: rwatson
2004-03-07 16:28:31 +00:00
John Baldwin
a5b061f9d2 Fix some becuase -> because typos.
Reported by:	Marco Wertejuk <wertejuk@mwcis.com>
2003-12-17 16:12:01 +00:00
David E. O'Brien
ab0de15baf Use __FBSDID(). 2003-06-11 05:37:42 +00:00
Don Lewis
263c8abeb9 Beat vnode locking in the NFS server code into submission. This change
is not pretty, but it fixes the code so that it no longer violates the
vnode locking rules in the VFS API and doesn't trip any of the locking
assertions enabled by the DEBUG_VFS_LOCKS kernel configuration option.
There is one report that this patch fixed a "locking against myself"
panic on an NFS server that was tripped by a diskless client.

Approved by:	re (scottl)
2003-05-25 06:17:33 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Jeff Roberson
24b50116ed - Introduce a new macro, since that's what nfs loves, called
nfsm_srvpathsiz.  This macro plucks a length out of an rpc request and
   verifies that its size does not exceed NFS_MAXPATHLEN.  If it does
   it generates an ENAMETOOLONG response.
 - Use this macro, and the existing nfsm_srvnamsiz macro in two places
   where we deal with paths passed in by the client.

This fixes a linux interoperability bug.  Linux was sending oversized path
components which would cause us to ignore the request all together.  This
causes linux to hang indefinitly while it waits for a response.  This
could still happen in other cases where we error out with EBADRPC.

Sponsored by:	Isilon Systems, Inc.
Reviewed by:	alfred, fabbri@isilon.com, neal@isilon.com
2002-10-31 22:35:03 +00:00
Robert Watson
94998f80fe Set the NOMACCHECK flag for namei()'s generated by the NFS server code.
We currently don't enforce protections on NFS-originated VOP's.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 21:27:40 +00:00
Poul-Henning Kamp
f5cd3d67fe Make the V2 errno translation more resistent to new errnos. 2002-08-21 19:28:44 +00:00
Alfred Perlstein
09ce4f7aaf Add IPv6 support.
Submitted by: Jean-Luc Richier <Jean-Luc.Richier@imag.fr>
2002-07-15 19:40:23 +00:00
Matthew Dillon
3d8f797ac1 Convert old style (type foo *)0 casts to NULLs
PR:		kern/40360
Requested by:	Hiten PAndya via direct email
2002-07-11 17:54:58 +00:00
Jeff Roberson
ab426dc822 Remove references to vm_zone.h and switch over to the new uma API. 2002-03-20 10:07:52 +00:00
John Baldwin
a854ed9893 Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.
2002-02-27 18:32:23 +00:00
Mike Smith
b3a39c8ae2 Rename some variables that end up shadowing their namesakes in the NFS client
code.

Reviewed by:	peter
2002-01-08 19:41:06 +00:00
Ian Dowse
9669bb479a Avoid passing the variable `tl' to functions that just use it for
temporary storage. In the old NFS code it wasn't at all clear if
the value of `tl' was used across or after macro calls, but I'm
fairly confident that the convention was to keep its use local.
Each ex-macro function now uses a local version of this variable,
so all of the double-indirection goes away.

The only exception to the `local use' rule for `tl' is nfsm_clget(),
which is left unchanged by this commit.

Reviewed by:	peter
2001-12-18 01:22:09 +00:00
Peter Wemm
b9b0e19206 Unwind some more macros. NFSMADV() was kinda silly since it was right
next to equivalent m_len adjustments.  Move the nfsm_subs.h macros
into groups depending on which phase they are used in, since that
affects the error recovery requirements.  Collect some of the common error
checking into a single macro as preparation for unwinding some more.
Have nfs_rephead return a value instead of secretly modifying args.
Remove some unused function arguments that were being passed around.
Clarify nfsm_reply()'s error handling (I hope).
2001-09-28 04:37:08 +00:00
Peter Wemm
1290984b33 Make nfsm_dissect() have an obvious return value. 2001-09-27 22:40:38 +00:00
Peter Wemm
ea7fe289fe Tidy up nfsm_build usage. This is only partially finished. 2001-09-27 02:33:36 +00:00
Peter Wemm
987f5fca86 Wrap a module around the init code so that we have somethign do do a
modfind(2) on, and declare a version so that loader/kldload etc
can locate it (using kldxref's linker.hints file if needed).
2001-09-20 05:13:43 +00:00
Peter Wemm
eb25edbda3 Cleanup and split of nfs client and server code.
This builds on the top of several repo-copies.
2001-09-18 23:32:09 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Matthew Dillon
0cddd8f023 With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage).  Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
2001-07-04 16:20:28 +00:00
John Baldwin
bc2327c310 - Protect the mnt_vnode list with the mntvnode lock.
- Use queue(9) macros.
2001-06-28 04:10:07 +00:00
Alfred Perlstein
2395531439 Introduce a global lock for the vm subsystem (vm_mtx).
vm_mtx does not recurse and is required for most low level
vm operations.

faults can not be taken without holding Giant.

Memory subsystems can now call the base page allocators safely.

Almost all atomic ops were removed as they are covered under the
vm mutex.

Alpha and ia64 now need to catch up to i386's trap handlers.

FFS and NFS have been tested, other filesystems will need minor
changes (grabbing the vm lock when twiddling page properties).

Reviewed (partially) by: jake, jhb
2001-05-19 01:28:09 +00:00
Greg Lehey
60fb0ce365 Revert consequences of changes to mount.h, part 2.
Requested by:	bde
2001-04-29 02:45:39 +00:00
Greg Lehey
d98dc34f52 Correct #includes to work with fixed sys/mount.h. 2001-04-23 09:05:15 +00:00
Bosko Milekic
2a0c503e7a * Rename M_WAIT mbuf subsystem flag to M_TRYWAIT.
This is because calls with M_WAIT (now M_TRYWAIT) may not wait
  forever when nothing is available for allocation, and may end up
  returning NULL. Hopefully we now communicate more of the right thing
  to developers and make it very clear that it's necessary to check whether
  calls with M_(TRY)WAIT also resulted in a failed allocation.
  M_TRYWAIT basically means "try harder, block if necessary, but don't
  necessarily wait forever." The time spent blocking is tunable with
  the kern.ipc.mbuf_wait sysctl.
  M_WAIT is now deprecated but still defined for the next little while.

* Fix a typo in a comment in mbuf.h

* Fix some code that was actually passing the mbuf subsystem's M_WAIT to
  malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the
  value of the M_WAIT flag, this could have became a big problem.
2000-12-21 21:44:31 +00:00
Kirk McKusick
d6514f21d7 In preparation for deprecating CIRCLEQ macros in favor of TAILQ
macros which provide the same functionality and are a bit more
efficient, convert use of CIRCLEQ's in NFS to TAILQ's.
2000-11-14 08:00:39 +00:00
David Malone
dc6dd1259f Problem to avoid processes getting stuck in "vmopar". From Ian's
mail:

	The problem seems to originate with NFS's postop_attr
	information that is returned with a read or write RPC.
	Within a vm_fault context, the code cannot deal with
	vnode_pager_setsize() shrinking a vnode.

	The workaround in the patch below stops the nfsm_postop_attr()
	macro from ever shrinking a vnode. If the new size in the
	postop_attr information is smaller, then it just sets the
	nfsnode n_attrstamp to 0 to stop the wrong size getting
	used in the future. This change only affects postop_attr
	attributes; the nfsm_loadattr() macro works as normal.

	The change is implemented by adding a new argument to
	nfs_loadattrcache() called 'dontshrink'. When this is
	non-zero, nfs_loadattrcache() will never reduce the
	vnode/nfsnode size; instead it zeros n_attrstamp.

There remain other was processes can get stuck in vmopar.

Submitted by:	Ian Dowse <iedowse@maths.tcd.ie>
Reviewed by:	dillon
Tested by:	Vadim Belman <voland@lflat.org>
2000-10-24 10:13:36 +00:00
Kirk McKusick
9b97113391 This patch corrects the first round of panics and hangs reported
with the new snapshot code.

Update addaliasu to correctly implement the semantics of the old
checkalias function. When a device vnode first comes into existence,
check to see if an anonymous vnode for the same device was created
at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
creating a new vnode for the device. This corrects a problem which
caused the kernel to panic when taking a snapshot of the root
filesystem.

Change the calling convention of vn_write_suspend_wait() to be the
same as vn_start_write().

Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue when suspending filesystem
operations.

Access to buffers becomes recursive so that snapshots can recursively
traverse their indirect blocks using ffs_copyonwrite() when checking
for the need for copy on write when flushing one of their own indirect
blocks. This eliminates a deadlock between the syncer daemon and a
process taking a snapshot.

Ensure that softdep_process_worklist() can never block because of a
snapshot being taken. This eliminates a problem with buffer starvation.

Cleanup change in ffs_sync() which did not synchronously wait when
MNT_WAIT was specified. The result was an unclean filesystem panic
when doing forcible unmount with heavy filesystem I/O in progress.

Return a zero'ed block when reading a block that was not in use at
the time that a snapshot was taken. Normally, these blocks should
never be read. However, the readahead code will occationally read
them which can cause unexpected behavior.

Clean up the debugging code that ensures that no blocks be written
on a filesystem while it is suspended. Snapshots must explicitly
label the blocks that they are writing during the suspension so that
they do not cause a `write on suspended filesystem' panic.

Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
prevent a race condition that would permit the same block to be
copied twice. This change eliminates an unexpected soft updates
inconsistency in fsck caused by the double allocation.

Use bqrelse rather than brelse for buffers that will be needed
soon again by the snapshot code. This improves snapshot performance.
2000-07-24 05:28:33 +00:00
Jake Burkholder
e39756439c Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
Jake Burkholder
740a1973a6 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
Poul-Henning Kamp
9626b608de Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by:    peter
2000-05-05 09:59:14 +00:00
Poul-Henning Kamp
3389ae9350 Remove ~25 unneeded #include <sys/conf.h>
Remove ~60 unneeded #include <sys/malloc.h>
2000-04-19 14:58:28 +00:00