760 Commits

Author SHA1 Message Date
rwatson
9a0e4c7010 Merge nfs_nfsiod.c:1.89 from HEAD to RELENG_6:
Adjust minimum iod threads from 4 to 0 -- since we compile the NFS
  client into the kernel by default, and many users won't use NFS,
  don't start an extra 4 kernel threads that are unused.  Once NFS
  becomes active, it will start nfsiod's as it needs them.

  We might consider mandating a minimum iod's equal to the number of
  active NFS mounts (truncated to some value), which would force some
  to remain available without having to create a new one if the file
  system is mostly inactive.

  PR:             70880
  Prodded by:     cel
  Head nod:       peter
  Pointed out by: Joe <fbsd_user at a1poweruser dot com>
2006-06-08 22:57:07 +00:00
cel
f84994a70d NFS over TCP retransmit behavior should default to a 60 second time out,
mimicking the NFS reference implementation.

NFS over TCP does not need fast retransmit timeouts, since network loss
and congestion are managed by the transport (TCP), unlike with NFS over
UDP.  A long timeout prevents the unnecessary retransmission of non-
idempotent NFS requests.

Reviewed by:	mohans, silby, rees?
Sponsored by:	Network Appliance, Incorporated
2006-05-30 01:52:59 +00:00
cel
4ec879514b Refactor the NFS over UDP retransmit timeout estimation logic to allow
the estimator to be more easily tuned and maintained.

There should be no functional change except there is now a lower limit
on the retransmit timeout to prevent the client from retransmitting
faster than the server's disks can fill requests, and an upper limit
to prevent the estimator from taking too long to retransmit during a
server outage.

Reviewed by:	mohans, kris, silby
Sponsored by:	Network Appliance, Incorporated
2006-05-30 00:43:07 +00:00
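The lower and upper limits described in 4ec879514b amount to clamping the estimator's output. A minimal sketch of that idea follows; the function name and both bounds are hypothetical, since the commit message does not state the real values or units.

```c
#include <assert.h>

/* Hypothetical bounds in milliseconds; the log does not give the real ones. */
#define SKETCH_RTO_MIN_MS   100
#define SKETCH_RTO_MAX_MS 60000

/*
 * Clamp an estimated retransmit timeout into [min, max].
 * The floor keeps the client from retransmitting faster than the server
 * can service requests; the ceiling bounds the wait during an outage.
 */
static int
sketch_clamp_rto(int estimate_ms)
{
	assert(SKETCH_RTO_MIN_MS <= SKETCH_RTO_MAX_MS);
	if (estimate_ms < SKETCH_RTO_MIN_MS)
		return (SKETCH_RTO_MIN_MS);
	if (estimate_ms > SKETCH_RTO_MAX_MS)
		return (SKETCH_RTO_MAX_MS);
	return (estimate_ms);
}
```

Keeping the clamp separate from the estimator is one way to make the limits tunable without touching the estimation arithmetic.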
delphij
dfb738e5a6 MFC src/sys/nfsclient/nfs_bio.c,v 1.154
and src/sys/nfsclient/nfs_vnops.c,v 1.262 (by ps@):

 - Always return success from NFS strategy. nfs_doio(), in the
   event of an error, does the right thing, in terms of setting
   the error flags in the buf header. That fixes a crash from
   bstrategy().
 - Treat ETIMEDOUT as a "recoverable" error, causing the buffer
   to be re-dirtied. ETIMEDOUT can occur on soft mounts, when
   the number of retries is exceeded, and we don't want data loss
   in that case.

Submitted by:   Mohan Srinivasan
Approved by:	re (scottl)
2006-04-18 05:31:58 +00:00
jon
73f7f6d707 MFC 1.261 - fix a crash when an nfsv2 mount fails
Approved by:	re
2006-04-18 05:18:47 +00:00
cel
8c36fd3864 If an NFS server returns more than a few EJUKEBOX errors for a given RPC
request, the FreeBSD NFS client will quickly back off to an excessively
long wait (days, then weeks) before retrying the request.

Change the behavior of the FreeBSD NFS client to match the behavior of
the reference NFS client implementation (Solaris).  This provides a fixed
delay of 10 seconds between each retry by default.  A sysctl, called
nfs3_jukebox_delay, is now available to tune the delay.  Unlike Solaris,
the sysctl value on FreeBSD is in seconds, rather than in HZ.

MFC revision 1.136 to RELENG_6

Sponsored by:   Network Appliance, Incorporated
Reviewed by:    rick
Approved by:    re (kensmith), silby
2006-04-02 04:11:23 +00:00
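Because 8c36fd3864 expresses the jukebox delay in seconds rather than in HZ, the value has to be scaled to ticks before it can be used as a sleep interval. The sketch below illustrates that conversion and the resulting linear total wait; both helper names are hypothetical and this is not the code from the commit.

```c
#include <assert.h>

/* Convert a delay given in seconds into clock ticks for a given hz. */
static long
sketch_jukebox_delay_ticks(int delay_seconds, int hz)
{
	assert(delay_seconds >= 0 && hz > 0);
	return ((long)delay_seconds * hz);
}

/*
 * Total time spent waiting, in seconds, after `retries` EJUKEBOX replies
 * when every retry uses the same fixed delay. Growth is linear, so the
 * wait cannot run away to days or weeks.
 */
static long
sketch_total_wait_seconds(int retries, int delay_seconds)
{
	assert(retries >= 0 && delay_seconds >= 0);
	return ((long)retries * delay_seconds);
}
```

With the default of 10 seconds, six consecutive EJUKEBOX replies cost one minute of waiting in this model.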
kris
c506d5366a MFC r1.137:
Fix a bug in the NFS/TCP retransmission path.

The bug was that earlier, if a request was retransmitted,
we would do subsequent retransmits every 10 msecs.

This can cause data corruption under moderate loads by reordering
operations as seen by the client NFS attribute cache, and on the
server side when the retransmission occurs after the original request
has left the duplicate cache, since the operation will be committed
for a second time.

Further work on retransmission handling is needed (e.g. retransmissions
are still sent too often since they are scaled by HZ, and the size of
the dup cache is too small and easily overwhelmed on busy servers).

Submitted by:   mohans
Approved by:	re (mux)
2006-03-31 07:13:09 +00:00
cel
cfec640a25 Fix a bug in NFSv3 READDIRPLUS reply processing

The client's READDIRPLUS logic skips the attributes and
filehandle of the ".." entry.  If the server doesn't send
attributes but does send a filehandle for "..", the
client's logic doesn't account for the extra "value
follows" field that indicates whether the filehandle is
present, causing the remaining entries in the reply
to be ignored.

This is an MFC of 1.264 in the CURRENT branch.

Sponsored by:   Network Appliance, Inc.
Reviewed by:    rick, mohans
Approved by:    re, silby
2006-03-29 18:11:32 +00:00
delphij
6b5f6d40b5 MFC 1.263: a typo fix (diff reduction against -HEAD)
Approved by:	re (hrs)
2006-03-24 04:48:42 +00:00
pjd
84853dde8e MFC: sys/nfsclient/nfs_diskless.c 1.15
I wanted 'nolockd' here instead of 'lockd'.

Approved by:	re (mux)
2006-03-20 15:45:14 +00:00
scottl
c5719df4a5 MFC: Call vfs_destroy_object() before v_data gets set to NULL.
Approved by: re
2006-03-12 21:50:02 +00:00
pjd
8d7bed0cec MFC: sys/nfsclient/nfs_diskless.c 1.12,1.13
Add boot.nfsroot.options loader tunable.
It allows options to be specified for the NFS root file system.
Currently supported options are: soft, intr, conn, lockd.

I'm adding this functionality mostly for 'lockd' option, which is only
honored when performing the initial mount and will be silently ignored
if used while updating the mount options.

This will allow flock(2) to be used without the need for varmfs or
rpc.lockd and friends.

Example of use:
boot.nfsroot.options="intr,lockd"

Approved by:	re (scottl)
2006-03-01 18:01:28 +00:00
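A value such as the one shown in 8d7bed0cec is a comma-separated list, so honoring it comes down to splitting on commas and mapping each known name to a flag. The sketch below is an illustration only: the function name and flag constants are hypothetical, and the option names are the four listed in that commit message (a later entry in this log, 84853dde8e, notes that 'nolockd' was intended instead of 'lockd').

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits, not the kernel's NFSMNT_* values. */
#define SKETCH_OPT_SOFT  0x1u
#define SKETCH_OPT_INTR  0x2u
#define SKETCH_OPT_CONN  0x4u
#define SKETCH_OPT_LOCKD 0x8u

/* Parse a comma-separated option string; unknown names are ignored. */
static unsigned
sketch_parse_nfsroot_options(const char *opts)
{
	static const struct { const char *name; unsigned flag; } table[] = {
		{ "soft", SKETCH_OPT_SOFT }, { "intr", SKETCH_OPT_INTR },
		{ "conn", SKETCH_OPT_CONN }, { "lockd", SKETCH_OPT_LOCKD },
	};
	unsigned flags = 0;

	assert(opts != NULL);
	while (*opts != '\0') {
		/* Length of the current token, up to the next comma or the end. */
		size_t len = strcspn(opts, ",");

		for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
			if (strlen(table[i].name) == len &&
			    strncmp(opts, table[i].name, len) == 0)
				flags |= table[i].flag;
		}
		opts += len;
		if (*opts == ',')
			opts++;		/* step over the separator */
	}
	return (flags);
}
```

Passing "intr,lockd" to this sketch yields the two corresponding bits; an empty string yields no flags.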
yar
dbcb706f58 Work around the shortness of the size argument to
vnode_create_vobject() while preserving the binary ABI
to filesystem modules in RELENG_6: introduce a new function
vnode_create_vobject_off() that takes the size argument
as off_t; move all stock file systems to it; re-implement
the old vnode_create_vobject() using vnode_create_vobject_off()
so that old or binary-only FS modules can work w/o hitting the
bug.  The trick is to pass a size of 0 to vnode_create_vobject_off()
so that it will call VOP_GETATTR() and thus get the actual,
untruncated file size even if the calling module still uses
the old vnode_create_vobject().

PR:		kern/92243
Approved by:	re (scottl)
2006-02-20 00:53:15 +00:00
rees
24c9cc5118 MFC rev 1.135:
Don't log an error on tcp connection reset, even if we don't get ECONNRESET.

Submitted by:	cel@citi.umich.edu
Approved by:	re (scottl)
2006-02-16 02:39:52 +00:00
rwatson
f017c618f0 Merge nfs_lock.c:1.43 from HEAD to RELENG_6:
In nfs_dolock(), GC now under-used ioflg, rendered obsolete when we moved
  from using a fifo to talk to rpc.lockd to using a special device node.

Approved by:	re (scottl)
2006-02-14 00:06:32 +00:00
tegge
81ceadf72a MFC: Add marker vnodes to ensure that all vnodes associated with the mount
point are iterated over when using MNT_VNODE_FOREACH.
2006-01-14 01:18:03 +00:00
maxim
b08fccbd06 MFC rev. 1.134: fix for a bug where NFS/TCP would
not reconnect (in the case where the server FIN'ed).

PR:		kern/88833
Requested by:	Roman V. Palagin
Approved by:	Mohan Srinivasan
2005-12-15 18:10:37 +00:00
rees
94b8aef59a MFC: nfs_socket.c 1.132, nfs_subs.c 1.142, nfsm_subs.h 1.37
fix a problem with XID re-use when a server returns NFSERR_JUKEBOX.
2005-12-13 21:29:26 +00:00
delphij
88a8009c9a MFC 1.260 (by ps): Fixed a panic that can happen when nfs_lookup() hits
an error.

RELENG_6_0 errata candidate.
2005-11-25 13:27:22 +00:00
glebius
42def59c5e MFC:
- Fix leak of struct nlminfo on process exit.
- Fix malloc type collision, that made the above problem
  difficult to understand.

Reported by:	Vladimir Sharun <sharun ukr.net>

Approved by:	re (scottl)
2005-10-27 18:35:19 +00:00
delphij
31cd49528b MFC (by ps):
| Fixes for NFS crashes on architectures that require strict alignment.
| - Fix nfsm_disct() so that after pulling up data, the remaining data
|   is aligned if necessary.
| - Fix nfs_clnt_tcp_soupcall() to bcopy() the rpc length out of the
|   mbuf (instead of casting m_data to a uint32).
|
| Submitted by:   Pyun YongHyeon
| Reviewed by:    Mohan Srinivasan
|
| Revision  Changes    Path
| 1.118     +12 -3     src/sys/nfs/nfs_common.c
| 1.38      +6 -0      src/sys/nfs/nfs_common.h
| 1.126     +2 -1      src/sys/nfsclient/nfs_socket.c

Approved by:	re (scottl)
2005-10-09 03:21:56 +00:00
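The second fix in 31cd49528b replaces a pointer cast with a byte copy, which is the usual way to read a multi-byte field from a buffer that may not be aligned. A minimal userland sketch of the same idea, using memcpy() in place of the kernel's bcopy() and a hypothetical helper name:

```c
#include <arpa/inet.h>	/* ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>	/* memcpy */

/*
 * Read a 32-bit big-endian value from a possibly unaligned address.
 * Dereferencing (uint32_t *)p directly can fault on strict-alignment
 * architectures; copying into an aligned local variable cannot.
 */
static uint32_t
sketch_read_be32(const unsigned char *p)
{
	uint32_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return (ntohl(v));
}
```

The copy costs four byte moves, which compilers typically turn into a single load on architectures that permit unaligned access.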
delphij
373205352c MFC (by ps)
| In nfs_nget() if two threads race on the same filehandle, the loser
| should cause the nfsnode to get freed. This fixes a potential vnode
| (and nfsnode) leak in that path.
|
| Submitted by:   Mohan Srinivasan
| Reviewed by:    phk
|
| Revision  Changes    Path
| 1.78      +2 -1      src/sys/nfsclient/nfs_node.c

Approved by:	re (scottl)
2005-10-09 03:15:36 +00:00
rwatson
5db6e492ee Merge subr_prof.c:1.119, 1.120, 1.121, nfs_socket.c:1.130,
rpcclnt.c:1.14 from HEAD to RELENG_6:

Acquire Giant in uprintf() and tprintf() due to the non-MPSAFEty of
the tty code invoked from these functions.  In two cases, during
timeout handling in NFS-related RPC client code, acquire Giant in
the caller before other mutexes the caller might hold, in order to
avoid lock order reversals with Giant (a recursive acquire is not
a reversal as it won't ever wait).

Correct age-old comments about uprintf()/tprintf() sleeping: they
will never sleep.

Much useful feedback from:	bde
Approved by:			re (scottl)
2005-09-29 18:40:36 +00:00
ps
21b8a6b2f6 MFC: rev 1.127
Fix for a NFS soft mounts bug where if the number of retries exceeds
the max rexmits, the request was not being bounced back with an
ETIMEDOUT error.

Approved by:	re
2005-07-21 16:19:02 +00:00
green
b4b5044eed Ifdef out the incomplete non-blocking IO implementation for NFS
pending discussion of how implementation would proceed.  Applications
like -lc_r expect select(3) to match the EAGAIN-status of IO
functions.

Approved by:	re
2005-06-16 15:43:17 +00:00
green
ff904ffb64 Fix a serious deadlock with the NFS client. Given a large enough
atomic write request, it can fill the buffer cache with the entirety
of that write in order to handle retries.  However, it never drops
the vnode lock, or else it wouldn't be atomic, so it ends up waiting
indefinitely for more buf memory that cannot be gotten as it has it
all, and it waits in an uncancellable state.

To fix this, hibufspace is exported and scaled to a reasonable
fraction.  This is used as the limit of how much of an atomic write
request by the NFS client will be handled asynchronously.  If the
request is larger than this, it will be turned into a synchronous
request which won't deadlock the system.  It's possible this value is
far off from what is required by some, so it shall be tunable as soon
as mount_nfs(8) learns of the new field.

The slowdown between an asynchronous and a synchronous write on NFS
appears to be on the order of 2x-4x.

General nod by:	gad
MFC after:	2 weeks
More testing:	wes
PR:		kern/79208
2005-06-10 23:50:41 +00:00
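The decision described in ff904ffb64 is a size check against a fraction of hibufspace. The sketch below shows the shape of that check; the divisor is a placeholder because the commit says only "a reasonable fraction", and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder divisor; the real fraction is not stated in the log. */
#define SKETCH_HIBUF_DIVISOR 4

/*
 * Return true when an atomic write is too large to buffer asynchronously
 * and should be issued synchronously, so that it cannot consume the whole
 * buffer cache while holding the vnode lock.
 */
static bool
sketch_write_must_be_sync(size_t request_bytes, size_t hibufspace)
{
	size_t limit;

	assert(SKETCH_HIBUF_DIVISOR > 0);
	limit = hibufspace / SKETCH_HIBUF_DIVISOR;
	return (request_bytes > limit);
}
```

Requests at or below the limit keep the faster asynchronous path; only oversized requests pay the synchronous slowdown mentioned in the commit.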
des
0bbbcadeb1 Ugh. Previous commit got the logic exactly backward.
Submitted by:	bland
Pointy hat to:	des
2005-05-17 18:23:03 +00:00
des
d3a9750001 Revision 1.173 broke updating a mount from ro to rw. Fix that by clearing
the MNT_RDONLY flag if MNT_UPDATE is set and "ro" was not specified.

Suggested by:	cognet
2005-05-17 12:00:43 +00:00
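Taken together with 5e15cfc3fa further down the log, the intended rule is: set the read-only flag when "ro" is given, clear it on an update without "ro", and otherwise leave it alone. A sketch of that rule follows, using stand-in flag values and a hypothetical function name rather than the kernel's MNT_* definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_RDONLY 0x1u	/* stand-in for MNT_RDONLY */
#define SKETCH_UPDATE 0x2u	/* stand-in for MNT_UPDATE */

/* Apply the "ro" mount option to a flag word and return the result. */
static unsigned
sketch_apply_ro(unsigned flags, bool ro_specified)
{
	/* This sketch models only the two flags above. */
	assert((flags & ~(SKETCH_RDONLY | SKETCH_UPDATE)) == 0);

	if (ro_specified)
		flags |= SKETCH_RDONLY;
	else if (flags & SKETCH_UPDATE)
		flags &= ~SKETCH_RDONLY;	/* ro -> rw on update */
	/* Otherwise leave the flag untouched (the diskless case). */
	return (flags);
}
```

The third branch is what preserves an already-set read-only flag for a diskless root that arrives with no mount options.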
rees
59c5573379 set R_MUSTRESEND flag in mark_for_reconnect so re-connected requests get
re-sent instead of timing out.

don't log an error message on reconnection, which is not an error.

remove unused nfs_mrep_before_tsleep.

Reviewed by:	Mohan Srinivasan
Approved by:	alfred
2005-05-10 14:25:14 +00:00
ps
40a0d434da Fix a bug in NFS/TCP where retransmissions would not reliably happen
if the server rebooted or tore down the connection for any reason.

Found by:	Jonathan Noack.
Submitted by:	Mohan Srinivasan.
2005-05-04 16:37:31 +00:00
iedowse
2593dad93c Don't copy the NFSMNT_* flags into struct statfs's f_flags field,
as they have no connection with the expected MNT_* flags. This bug
was exposed 18 months ago when the assignments to f_flags in
vfs_syscalls.c were moved to before the VFS_STATFS() call. It was
fixed in the CSRG source 10 years ago, but we never picked up that
change.

PR:		kern/80390
MFC after:	1 week
2005-05-02 15:57:10 +00:00
des
5e15cfc3fa When NFS was converted to the new mount syscall, code was written that sets
the MNT_RDONLY flag if the "ro" option was passed in from userland, and
clears it otherwise.  In the diskless case, the MNT_RDONLY flag is already
set when this code is reached, but there are no mount options, so it was
incorrectly cleared.  Change the logic so the MNT_RDONLY flag is set if the
"ro" option was specified, and left alone otherwise.

Note that the NFS code will still happily let you mount a filesystem RW
even if the server exports it RO.  I'm not sure how to fix that.
2005-04-27 14:46:02 +00:00
des
de2d951ab7 While I'm here, list the new kenv (boot.netif.name) along with the others.
2005-04-26 20:47:59 +00:00
des
37881dde0f When netbooting, as soon as we've figured out which interface we booted
from, store its name in a kenv variable.
2005-04-26 20:45:29 +00:00
rees
3e9035accc TCP reconnect is not an error.
Change the message from LOG_ERR to LOG_INFO.

Approved by:	alfred
2005-04-18 13:42:13 +00:00
jeff
e4eab9fb69 - cache_lookup() relocks the parent in the DOTDOT case for us.
Spotted by:	phk
Sponsored by:	Isilon Systems, Inc.
2005-04-14 07:08:34 +00:00
jeff
afab3762a0 - Change all filesystems and vfs_cache to relock the dvp once the child is
locked in the ISDOTDOT case.  See vfs_lookup.c r1.79 for details.

Sponsored by:	Isilon Systems, Inc.
2005-04-13 10:59:09 +00:00
jeff
97c40ebd49 - LK_NOPAUSE is a nop now.
Sponsored by:   Isilon Systems, Inc.
2005-03-31 04:37:09 +00:00
jeff
ca1e4c2fe0 - Remove wantparent, it is no longer necessary. An assert in vfs_lookup.c
prevents any callers from doing a modifying op without
   LOCKPARENT or WANTPARENT.
2005-03-29 13:09:42 +00:00
jeff
141aba2c7b - cache_lookup() now locks the new vnode for us to prevent some races.
Remove redundant code.

Sponsored by:	Isilon Systems, Inc.
2005-03-29 13:00:37 +00:00
jeff
5f8bc80203 - We no longer have to bother with PDIRUNLOCK, lookup() handles it for us.
- Network filesystems are written with a special idiom that checks the
   cache first, and may even unlock dvp before discovering that a network
   round-trip is required to resolve the name.  I believe dvp is prevented
   from being recycled even in the forced unmount case by the shared lock
   on the mount point.  If not, this code should grow checks for VI_DOOMED
   after it relocks dvp or it will access NULL v_data fields.

Sponsored by:	Isilon Systems, Inc.
2005-03-28 09:29:58 +00:00
jeff
56f1fc7189 - Update vfs_root implementations to match the new prototype. None of
these filesystems will support shared locks until they are explicitly
   modified to do so.  Careful review must be done to ensure that this
   is safe for each individual filesystem.

Sponsored by:   Isilon Systems, Inc.
2005-03-24 07:39:03 +00:00
ps
114057c633 - The NFS client was incorrectly masking SIGSTOP (which is
non-maskable).
- The NFS client needs to guard against spurious wakeups
  while waiting for the response. ltrace causes the process
  in question to wake up (possibly from ptrace()), which
  causes NFS to wake up from tsleep without the response being
  delivered.

Submitted by:	Mohan Srinivasan
2005-03-23 22:10:10 +00:00
das
89bc04ad2d Don't brelse(bp) if bp is null. Also, eliminate some redundancy
and dead code.

Found by:	Coverity Prevent analysis tool
2005-03-18 21:23:32 +00:00
phk
172eba2632 Use vfs_hash.
2005-03-16 11:28:19 +00:00
jmg
64c69bfb4e MFp4: use the function to fix the packet header length instead of rolling
our own...
2005-03-16 08:13:08 +00:00
jeff
29a4f75b9b - VOP_INACTIVE should no longer drop the vnode lock.
Sponsored by:	Isilon Systems, Inc.
2005-03-13 12:15:36 +00:00
jeff
5bd51ec6e6 - The VI_DOOMED flag now signals the end of a vnode's relationship with
the filesystem.  Check that rather than VI_XLOCK.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 12:14:56 +00:00
jeff
5f59e0cd19 - It is no longer necessary to lock and unlock the vnode in nfs_close() as
the top level does this for us now.

Sponsored by:	Isilon Systems, Inc.
2005-03-13 12:11:23 +00:00
ps
d4a5a3bc89 Minor cleanup in nfs_request() and removal of a comment that doesn't
reflect reality.

Submitted by:	Mohan Srinivasan
2005-02-26 18:55:36 +00:00