Commit Graph

329 Commits

Author SHA1 Message Date
Robert Watson
1ee624b31d The socket code upcalls into the NFS server using the so_upcall
mechanism so that early processing on mbufs can be performed before
a context switch to the NFS server threads.  Because of this, if
the socket code is running without Giant, the NFS server also needs
to be able to run the upcall code without relying on the presence on
Giant.  This change modifies the NFS server to run using a "giant
code lock" covering operation of the whole subsystem.  Work is in
progress to move to data-based locking as part of the NFSv4 server
changes.

Introduce an NFS server subsystem lock, 'nfsd_mtx', and a set of
macros to operate on the lock:

  NFSD_LOCK_ASSERT()    Assert nfsd_mtx owned by current thread
  NFSD_UNLOCK_ASSERT()  Assert nfsd_mtx not owned by current thread
  NFSD_LOCK_DONTCARE()  Advisory: this function doesn't care
  NFSD_LOCK()           Lock nfsd_mtx
  NFSD_UNLOCK()         Unlock nfsd_mtx

Constify a number of global variables/structures in the NFS server
code, as they are not modified and contain constants only:

  nfsrvv2_procid       nfsrv_nfsv3_procid      nonidempotent
  nfsv2_repstat        nfsv2_type              nfsrv_nfsv3_procid
  nfsrvv2_procid       nfsrv_v2errmap          nfsv3err_null
  nfsv3err_getattr     nfsv3err_setattr        nfsv3err_lookup
  nfsv3err_access      nfsv3err_readlink       nfsv3err_read
  nfsv3err_write       nfsv3err_create         nfsv3err_mkdir
  nfsv3err_symlink     nfsv3err_mknod          nfsv3err_remove
  nfsv3err_rmdir       nfsv3err_rename         nfsv3err_link
  nfsv3err_readdir     nfsv3err_readdirplus    nfsv3err_fsstat
  nfsv3err_fsinfo      nfsv3err_pathconf       nfsv3err_commit
  nfsrv_v3errmap

There are additional structures that should be constified but due
to their being passed into general purpose functions without const
arguments, I have not yet converted.

In general, acquire nfsd_mtx when accessing any of the global NFS
structures, including struct nfssvc_sock, struct nfsd, struct
nfsrv_descript.

Release nfsd_mtx whenever calling into VFS, and acquire Giant for
calls into VFS.  Giant is not required for any part of the
operation of the NFS server with the exception of calls into VFS.
Giant will never by acquired in the upcall code path.  However, it
may operate entirely covered by Giant, or not.  If debug.mpsafenet
is set to 0, the system calls will acquire Giant across all
operations, and the upcall will assert Giant.  As such, by default,
this enables locking and allows us to test assertions, but should not
cause any substantial new amount of code to be run without Giant.
Bugs should manifest in the form of lock assertion failures for now.

This approach is similar (but not identical) to modifications to the
BSD/OS NFS server code snapshot provided by BSDi as part of their
SMPng snapshot.  The strategy is almost the same (single lock over
the NFS server), but differs in the following ways:

- Our NFS client and server code bases don't overlap, which means
  both fewer bugs and easier locking (thanks Peter!).  Also means
  NFSD_*() as opposed to NFS_*().

- We make broad use of assertions, whereas the BSD/OS code does not.

- Made slightly different choices about how to handle macros building
  packets but operating with side effects.

- We acquire Giant only when entering VFS from the NFS server daemon
  threads.

- Serious bugs in BSD/OS implementation corrected -- the snapshot we
  received was clearly a work in progress.

Based on ideas from:	BSDi SMPng Snapshot
Reviewed by:		rick@snowhite.cis.uoguelph.ca
Extensive testing by:	kris
2004-05-24 04:06:14 +00:00
Maxime Henrion
7cc35e41e7 Don't send the available space as is in the FSSTAT call. Under
FreeBSD, we can have a negative available space value, but the
corresponding fields in the NFS protocol are unsigned.  So
trnucate the value to 0 if it's negative, so that the client
doesn't receive absurdly high values.

Tested by:	cognet
2004-04-12 13:02:21 +00:00
Peter Edwards
ae00154c4b Don't let the NFS server module be unloaded as long as there are
nfsd processes running

Reviewed By:	iedowse
PR:		16299
2004-04-11 13:33:34 +00:00
Warner Losh
2fcbca0d85 Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson
2004-04-07 05:00:01 +00:00
Robert Watson
c1aec06a7c Add imperfect comments identifying the function of various nfs socket
condition flags.  Corrections, if appropriate, welcome.
2004-04-06 01:58:58 +00:00
Robert Watson
ecd189d420 Spell 2 as SHUT_RDWR when used as an argument to soshutdown(). 2004-04-04 19:24:08 +00:00
Robert Watson
2a8021f14b Explicitly compare pointers with NULL rather than treating a pointer as
a boolean directly, use NULL instead of 0.
2004-04-04 19:13:35 +00:00
Peter Wemm
6df0617286 Calculate NFS timeouts in units of 10ms, not 5ms. This matches the default
clock precision on i386.  This is a NOP change on i386.  But this stops
the mount_nfs units from suddenly changing to units of 1/20 of a second
(vs the normal 1/10 of a second) if HZ is increased.
2004-03-14 06:21:56 +00:00
Poul-Henning Kamp
4d453ef101 Properly vector all bwrite() and BUF_WRITE() calls through the same path
and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().
2004-03-11 18:02:36 +00:00
Alexander Kabaev
fa3d2a12df Convert from timeout to callout API.
Submitted by: rwatson
2004-03-07 16:28:31 +00:00
Robert Watson
746e5bf09b Rename dup_sockaddr() to sodupsockaddr() for consistency with other
functions in kern_socket.c.

Rename the "canwait" field to "mflags" and pass M_WAITOK and M_NOWAIT
in from the caller context rather than "1" or "0".

Correct mflags pass into mac_init_socket() from previous commit to not
include M_ZERO.

Submitted by:	sam
2004-03-01 03:14:23 +00:00
John Baldwin
a5b061f9d2 Fix some becuase -> because typos.
Reported by:	Marco Wertejuk <wertejuk@mwcis.com>
2003-12-17 16:12:01 +00:00
Robert Watson
8accd36ef0 Update a comment about needing to fix NFS server credential use
by 5.0-RELEASE: make it now read 5.3-RELEASE to be realistic.  Still
needs fixing...
2003-11-17 00:56:53 +00:00
Sam Leffler
a96756932a Assert GIANT_REQUIRED where sockets are manipulated. This is
preparatory for MPSAFE network commits and ongoing socket
locking work.

Supported by:	FreeBSD Foundation
2003-11-07 22:57:09 +00:00
Poul-Henning Kamp
63b92d134e When grabbing vnodes to service NFS requests, make sure to call
vn_start_write() early to avoid snapshot deadlocks.

By:	mckusick
2003-10-24 18:36:49 +00:00
Jeff Roberson
3c92540393 - Set the sopt_dir member of the sockopt structure, otherwise, this parameter
will not actually be set even though we're calling sosetopt.  sosetopt
   calls down to a single ctloutput function if the name or level is
   implemented by a specific protocol.

Submitted by:	pete@isilon.com
2003-10-04 17:37:51 +00:00
Poul-Henning Kamp
4a12d8397f Change idle state sleep identifier to "-" for nfsd. 2003-07-02 08:08:32 +00:00
Ian Dowse
92daf89227 Fix a bug in nfsrv_read() that caused the replies to certain NFSv3
short read operations at the end of a file to not have the "eof"
flag set as they should. The problem is that the requested read
count was compared against the rounded-up reply data length instead
of the actual reply data length. This bug appears to have been
introduced in revision 1.78 (June 1999). It causes first-time reads
of certain file sizes (e.g 4094 bytes) to fail with EIO on a RedHat
9.0 NFSv3 client.

MFC after:	1 week
2003-06-24 19:04:26 +00:00
Kirk McKusick
98530110a2 Increase the size of the NFS server hash table to improve performance
when serving up more than about 32 active files. For details see
section 6.3 (pg 111) of Daniel Ellard and Margo Seltzer, ``NFS
Tricks and Benchmarking Traps'' in the Proceedings of the Usenix
2003 Freenix Track, June 9-14, 2003 pg 101-114.

Obtained from:	Daniel Ellard <ellard@eecs.harvard.edu>
Sponsored by:   DARPA & NAI Labs.
2003-06-21 21:01:44 +00:00
David E. O'Brien
ab0de15baf Use __FBSDID(). 2003-06-11 05:37:42 +00:00
Jeffrey Hsu
30ef7aacc2 Protect read-modify-write increment of f_count field with file lock. 2003-06-05 06:05:57 +00:00
Poul-Henning Kamp
7379c88f4f Add /* FALLTHROUGH */
Found by:       FlexeLint
2003-05-31 18:20:26 +00:00
Don Lewis
263c8abeb9 Beat vnode locking in the NFS server code into submission. This change
is not pretty, but it fixes the code so that it no longer violates the
vnode locking rules in the VFS API and doesn't trip any of the locking
assertions enabled by the DEBUG_VFS_LOCKS kernel configuration option.
There is one report that this patch fixed a "locking against myself"
panic on an NFS server that was tripped by a diskless client.

Approved by:	re (scottl)
2003-05-25 06:17:33 +00:00
Alan Cox
b6e48e0372 - Acquire the vm_object's lock when performing vm_object_page_clean().
- Add a parameter to vm_pageout_flush() that tells vm_pageout_flush()
   whether its caller has locked the vm_object.  (This is a temporary
   measure to bootstrap vm_object locking.)
2003-04-24 04:31:25 +00:00
Jeff Roberson
c033bdc013 - Lock bufs before inspecting their flags. 2003-03-13 07:05:22 +00:00
Dag-Erling Smørgrav
521f364b80 More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9). 2003-03-02 16:54:40 +00:00
Jeff Roberson
17661e5ac4 - Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
   these fields instead.
 - Hold the vnode interlock while locking bufs on the clean/dirty queues.
   This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
   BUF_LOCK with a LK_TIMEFAIL to a single lock.

Reviewed by:	arch, mckusick
2003-02-25 03:37:48 +00:00
Poul-Henning Kamp
1c7d84ed3c Don't use mbuf allocator flags for malloc(9). 2003-02-22 10:35:37 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Matthew Dillon
48e3128b34 Bow to the whining masses and change a union back into void *. Retain
removal of unnecessary casts and throw in some minor cleanups to see if
anyone complains, just for the hell of it.
2003-01-13 00:33:17 +00:00
Matthew Dillon
cd72f2180b Change struct file f_data to un_data, a union of the correct struct
pointer types, and remove a huge number of casts from code using it.

Change struct xfile xf_data to xun_data (ABI is still compatible).

If we need to add a #define for f_data and xf_data we can, but I don't
think it will be necessary.  There are no operational changes in this
commit.
2003-01-12 01:37:13 +00:00
Jens Schweikhardt
9d5abbddbf Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.
2003-01-01 18:49:04 +00:00
Matthew Dillon
45587e2514 Abstract-out the constants for the sequential heuristic.
No operational changes.

MFC after:	1 day
2002-12-28 20:28:10 +00:00
Ian Dowse
2f07688e82 In the NFSv3 `fsinfo' procedure reply, don't claim that we support
32k read and write operations on datagram sockets when in fact we
reject requests larger than 16k. It must be the case that virtually
all clients use data sizes of 16k or less for UDP transport (FreeBSD's
client defaults to 8k and never exceeds 16k), as this bug has been
present ever since NFSv3 support was added.

Reported by:	Senthil <lihtnes78@netscape.net>
Reviewed by:	dillon
Approved by:	re
MFC-after:	1 week
2002-12-05 16:58:11 +00:00
Robert Watson
e5e820fd1f Permit MAC policies to instrument the access control decisions for
system accounting configuration and for nfsd server thread attach.
Policies might use this to protect the integrity or confidentiality
of accounting data, limit the ability to turn on or off accounting,
as well as to prevent inappropriately labeled threads from becoming nfs
server threads.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-04 15:13:36 +00:00
Jeff Roberson
24b50116ed - Introduce a new macro, since that's what nfs loves, called
nfsm_srvpathsiz.  This macro plucks a length out of an rpc request and
   verifies that its size does not exceed NFS_MAXPATHLEN.  If it does
   it generates an ENAMETOOLONG response.
 - Use this macro, and the existing nfsm_srvnamsiz macro in two places
   where we deal with paths passed in by the client.

This fixes a linux interoperability bug.  Linux was sending oversized path
components which would cause us to ignore the request all together.  This
causes linux to hang indefinitly while it waits for a response.  This
could still happen in other cases where we error out with EBADRPC.

Sponsored by:	Isilon Systems, Inc.
Reviewed by:	alfred, fabbri@isilon.com, neal@isilon.com
2002-10-31 22:35:03 +00:00
Robert Watson
94998f80fe Set the NOMACCHECK flag for namei()'s generated by the NFS server code.
We currently don't enforce protections on NFS-originated VOP's.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 21:27:40 +00:00
Robert Watson
60cfb7c64a Correct a problem wherein NFS servers running NFSv2 would not return
certain classes of failure responses to the client during a failed
remove operation.

Submitted by:	Ian Dowse <iedowse@maths.tcd.ie>
2002-10-03 21:50:37 +00:00
Jeff Roberson
d3b85e1c8b - Use incore() instead of gbincore() so we don't have to acquire the
vnode interlock.
2002-09-25 02:39:39 +00:00
Poul-Henning Kamp
7ed60de837 Use m_length() instead of home-rolled versions. 2002-09-18 19:44:14 +00:00
Poul-Henning Kamp
f5cd3d67fe Make the V2 errno translation more resistent to new errnos. 2002-08-21 19:28:44 +00:00
Jeff Roberson
e6e370a7fe - Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
   with VOP calls is needed.
 - v_iflag is protected by interlock and is used for dealing with vnode
   management issues.  These flags include X/O LOCK, FREE, DOOMED, etc.
 - All accesses to v_iflag and v_vflag have either been locked or marked with
   mp_fixme's.
 - Many ASSERT_VOP_LOCKED calls have been added where the locking was not
   clear.
 - Many functions in vfs_subr.c were restructured to provide for stronger
   locking.

Idea stolen from:	BSD/OS
2002-08-04 10:29:36 +00:00
Peter Wemm
c86e55f6f0 Oops, another unused arg to nfssvc_nfsd(). *blush*
Submitted by:	jake
2002-07-24 23:10:34 +00:00
Peter Wemm
28cc58165d Fully exterminate nfsd_srvargs and nfsd_cargs. They were either unused
or giant NOP's.  There was a credential in srvargs that was giving
rwatson some heartburn. :-)
2002-07-24 22:27:35 +00:00
Robert Watson
30268a208b Stick a dark comment in about the fact that the NFS server code allocates
a ucred by itself as part of an nfs descriptor, then bzero's the ucred,
fails to initialize the mutex, etc.  This is very bad, but I don't have
time to fix it right now.  nfsd should instead hold a cred pointer,
and the credential should be properly initialized, probably from a
descendent of a kernel process credential.
2002-07-24 14:24:16 +00:00
Hajimu UMEMOTO
506c29ecf3 sync comment with reality. IN6P_BINDV6ONLY -> IN6P_IPV6_V6ONLY. 2002-07-22 15:55:50 +00:00
Matthew Dillon
a96f7d1a1b 'recm' was not being unconditionally cleared for each loop, leading to
system lockups (infinite loops) when a zero-length RPC is received.
Linux clients will sometimes send zero-length RPC requests.

Reorganize the use of recm in the loop.

Cc: security@freebsd.org
Submitted by:	Mike Junk <junk@isilon.com>
MFC after:	3 days
2002-07-17 01:07:08 +00:00
Alfred Perlstein
09ce4f7aaf Add IPv6 support.
Submitted by: Jean-Luc Richier <Jean-Luc.Richier@imag.fr>
2002-07-15 19:40:23 +00:00
Matthew Dillon
3d8f797ac1 Convert old style (type foo *)0 casts to NULLs
PR:		kern/40360
Requested by:	Hiten PAndya via direct email
2002-07-11 17:54:58 +00:00