775 Commits

Author SHA1 Message Date
jhb
cb722c46dc MFC: Do not set B_NOCACHE on buffers when releasing them in flushbuflist().
If B_NOCACHE is set the pages of vm backed buffers will be invalidated.
However clean buffers can be backed by dirty VM pages so invalidating them
can lead to data loss.
Add support for flushing dirty pages in the data invalidation function
of some network file systems.

This fixes data losses during vnode recycling (and other code paths
using invalbuf(*,V_SAVE,*,*)) for data written using an mmaped file.
2007-02-12 19:08:29 +00:00
mohans
efe37ef4d8 Add missing MNT_ILOCK around some mnt_kern_flag accesses. 2007-02-11 03:43:34 +00:00
mohans
f035099ad1 MFC:
Fixes up the handling of shared vnode lock lookups in the NFS client,
adds a FS type specific flag indicating that the FS supports shared
vnode lock lookups, adds some logic in vfs_lookup.c to test this flag
and set lock flags appropriately.

This change fixes the general problem of cascading vnode locks when an
NFS server goes down.

Ideally, we wouldn't need these changes, as enabling shared vnode lock
lookups globally would work. Unfortunately, UFS, for example isn't
ready for shared vnode lock lookups, crashing pretty quickly.

This change is the result of discussions with Stephan Uphoff (ups@).
Thanks to Kris for shaking out several bugs in NFS with shared vnode
lock lookups in current. MFC'ed per Kris' request.

Reviewed by:	ups@
2007-02-11 03:07:46 +00:00
mohans
77c4a0dce1 MFC: Fix for a vnode lock leak in nfs_create() in the event of an error.
Spotted by ups@.
2007-01-31 23:11:15 +00:00
mohans
7f2590b5d0 MFC 3 fixes from -current. All having to do with the case where the same
filehandle is looked up by 2 or more processes.
- Don't vrele() the losing vnode, as vfs_hash_insert() vput()'s it.
- Initialize mutexes on the losing nfsnode (as these get destroyed in the
  nfsnode reclaim path).
- Move the initialization of the filehandle to before vfs_hash_insert(), to
  close some races which could result in multiple vnodes for the same
  filehandle being inserted into the hash.
2007-01-03 20:19:02 +00:00
sam
3a586a3803 MFC 1.67: honor nolockd flag in root mount options 2006-12-23 22:40:56 +00:00
mohans
7aaff56599 MFC:
Fix to readdir+ reply handling. When inserting an entry into the namecache,
initialize the nfsnode's ctime. Otherwise a subsequent lookup purges the
just entered namecache entry.
Approved by: re
2006-12-05 18:41:35 +00:00
bde
628d29a473 MFC (1.270: don't do null Setattr RPCs for VA_MARK_ATIME). 2006-11-23 09:50:18 +00:00
mohans
b4043fdc19 MFC: Make EWOULDBLOCK a recoverable error so that the request is
retransmitted. This bug results in data corruption. Writes are
silently dropped on EWOULDBLOCK (caused because socket send buffer is
full and sockbuf timer fires - with NFS/TCP).
Reviewed by: ups@
Approved by: re
2006-11-02 19:48:17 +00:00
tegge
690853d66d MFC: Use mount interlock to protect all changes to mnt_flag and
mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be
lost when nmount() raced against sync(), sync_fsync() or quotactl().

Approved by:	re (kensmith)
2006-10-09 19:47:17 +00:00
mohans
64c133521a MFC change 1.138.
Fix for an NFS/TCP client bug which would cause the NFS/TCP stream to get
out of sync under heavy loads, forcing frequent reconnects, causing EBADRPC
errors etc.

Approved by: re
2006-10-01 05:03:18 +00:00
mohans
2f6828bb9f MFC:
Vnode locks are recursive and the NFS client supports shared vnode locks.

Approved by: re
2006-09-13 19:25:44 +00:00
brooks
039ba93037 MFC: rev 1.185
Add a new kernel environment variable "boot.netif.mtu" which is used to
set the MTU prior to mounting root via NFS.  This is required if the
server supports a higher than default MTU because the client will not
see the responses otherwise.
2006-09-07 17:38:47 +00:00
kib
2f7d13c770 MFC rev. 1.267:
Always supply curthread as argument to nfs_asyncio and nfs_doio
in nfs_strategy. Otherwise, for some buffers, signals would be ignored
on intr mounts.

Reviewed by:	mohan
Approved by:	pjd (mentor)
2006-08-07 12:33:25 +00:00
kib
883b286035 MFC rev. 1.142:
Signals may be delivered to the process as well as to the thread. Check the
thread-delivered signals in addition to the process one.

Reviewed by:	mohan
Approved by:	pjd (mentor)
2006-08-07 12:32:10 +00:00
rwatson
9a0e4c7010 Merge nfs_nfsiod.c:1.89 from HEAD to RELENG_6:
Adjust minimum iod threads from 4 to 0 -- since we compile the NFS
  client into the kernel by default, and many users won't use NFS,
  don't start an extra 4 kernel threads that are unused.  Once NFS
  becomes active, it will start nfsiod's as it needs them.

  We might consider mandating a minimum iod's equal to the number of
  active NFS mounts (truncated to some value), which would force some
  to remain available without having to create a new one if the file
  system is mostly inactive.

  PR:             70880
  Prodded by:     cel
  Head nod:       peter
  Pointed out by: Joe <fbsd_user at a1poweruser dot com>
2006-06-08 22:57:07 +00:00
cel
f84994a70d NFS over TCP retransmit behavior should default to a 60 second time out,
mimicking the NFS reference implementation.

NFS over TCP does not need fast retransmit timeouts, since network loss
and congestion are managed by the transport (TCP), unlike with NFS over
UDP.  A long timeout prevents the unnecessary retransmission of non-
idempotent NFS requests.

Reviewed by:	mohans, silby, rees?
Sponsored by:	Network Appliance, Incorporated
2006-05-30 01:52:59 +00:00
cel
4ec879514b Refactor the NFS over UDP retransmit timeout estimation logic to allow
the estimator to be more easily tuned and maintained.

There should be no functional change except there is now a lower limit
on the retransmit timeout to prevent the client from retransmitting
faster than the server's disks can fill requests, and an upper limit
to prevent the estimator from taking too long to retransmit during a
server outage.

Reviewed by:	mohan, kris, silby
Sponsored by:	Network Appliance, Incorporated
2006-05-30 00:43:07 +00:00
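The lower and upper limits described in 4ec879514b amount to clamping the estimator's output. A minimal sketch, assuming a smoothed-RTT-plus-deviation estimator and made-up bounds (the commit message states neither the formula nor the values; every `sketch_` / `SKETCH_` name is hypothetical):

```c
/* Hypothetical bounds in milliseconds; the real values are not in the log. */
#define SKETCH_RTO_MIN_MS   100
#define SKETCH_RTO_MAX_MS 30000

/*
 * Compute a retransmit timeout from a smoothed RTT and its deviation
 * (assumed estimator shape), then clamp it so the client neither
 * retransmits faster than the server can answer nor waits too long
 * during an outage.
 */
static int
sketch_rto_ms(int srtt_ms, int rttvar_ms)
{
	int rto = srtt_ms + 4 * rttvar_ms;

	if (rto < SKETCH_RTO_MIN_MS)
		rto = SKETCH_RTO_MIN_MS;
	if (rto > SKETCH_RTO_MAX_MS)
		rto = SKETCH_RTO_MAX_MS;
	return (rto);
}
```

Keeping the clamp in one helper is what makes the limits easy to tune independently of the estimator itself.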
delphij
dfb738e5a6 MFC src/sys/nfsclient/nfs_bio.c,v 1.154
and src/sys/nfsclient/nfs_vnops.c,v 1.262 (by ps@):

 - Always return success from NFS strategy. nfs_doio(), in the
   event of an error, does the right thing, in terms of setting
   the error flags in the buf header. That fixes a crash from
   bstrategy().
 - Treat ETIMEDOUT as a "recoverable" error, causing the buffer
   to be re-dirtied. ETIMEDOUT can occur on soft mounts, when
   the number of retries is exceeded, and we don't want data loss
   in that case.

Submitted by:   Mohan Srinivasan
Approved by:	re (scottl)
2006-04-18 05:31:58 +00:00
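The "recoverable error" handling in dfb738e5a6 (and the later EWOULDBLOCK change in b4043fdc19) can be pictured roughly as follows. This is a userland sketch, not the nfs_doio() code: `struct sketch_buf` and both functions are hypothetical stand-ins for the buf header flags.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical buffer state; stands in for flags in a real buf header. */
struct sketch_buf {
	bool dirty;		/* data still has to reach the server */
	bool error;		/* a hard error was recorded */
	int  error_code;	/* errno value for a hard error */
};

/* Errors after which the write is retried later instead of dropped. */
static bool
sketch_is_recoverable(int error)
{
	return (error == ETIMEDOUT || error == EWOULDBLOCK);
}

/* Record the outcome of one write attempt on the buffer. */
static void
sketch_write_done(struct sketch_buf *bp, int error)
{
	bp->error = false;
	bp->error_code = 0;
	if (error == 0) {
		bp->dirty = false;	/* data is on the server */
	} else if (sketch_is_recoverable(error)) {
		bp->dirty = true;	/* re-dirty: keep the data, retry later */
	} else {
		bp->dirty = false;	/* assumed: give up and report the error */
		bp->error = true;
		bp->error_code = error;
	}
}
```

The point of the middle branch is that a timed-out soft mount write leaves the data in the cache rather than silently discarding it.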
jon
73f7f6d707 MFC 1.261 - fix a crash when an nfsv2 mount fails
Approved by:	re
2006-04-18 05:18:47 +00:00
cel
8c36fd3864 If an NFS server returns more than a few EJUKEBOX errors for a given RPC
request, the FreeBSD NFS client will quickly back off to an excessively
long wait (days, then weeks) before retrying the request.

Change the behavior of the FreeBSD NFS client to match the behavior of
the reference NFS client implementation (Solaris).  This provides a fixed
delay of 10 seconds between each retry by default.  A sysctl, called
nfs3_jukebox_delay, is now available to tune the delay.  Unlike Solaris,
the sysctl value on FreeBSD is in seconds, rather than in HZ.

MFC revision 1.136 to RELENG_6

Sponsored by:   Network Appliance, Incorporated
Reviewed by:    rick
Approved by:    re (kensmith), silby
2006-04-02 04:11:23 +00:00
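To illustrate the EJUKEBOX change in 8c36fd3864: a sketch contrasting a doubling backoff (an assumed shape for the old behavior; the log only says the wait grew to days, then weeks) with the fixed delay, whose tunable is kept in seconds and converted to ticks at the point of use. All names are hypothetical.

```c
#include <limits.h>

/* Default from the commit message: 10 seconds between retries. */
static int sketch_jukebox_delay_sec = 10;

/* Assumed old behavior: the wait doubles on every retry. */
static long
sketch_backoff_delay_sec(long base_sec, unsigned attempt)
{
	long d = base_sec;
	unsigned i;

	for (i = 0; i < attempt && d < LONG_MAX / 2; i++)
		d *= 2;
	return (d);
}

/* New behavior: the same delay on every retry, in clock ticks. */
static long
sketch_fixed_delay_ticks(unsigned attempt, int hz)
{
	(void)attempt;		/* the retry count no longer matters */
	return ((long)sketch_jukebox_delay_sec * hz);
}
```

With a one-second base, twenty doublings already exceed a million seconds (about twelve days), whereas the fixed variant stays at ten seconds' worth of ticks regardless of the attempt number.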
kris
c506d5366a MFC r1.137:
Fix a bug in the NFS/TCP retransmission path.

The bug was that earlier, if a request was retransmitted,
we would do subsequent retransmits every 10 msecs.

This can cause data corruption under moderate loads by reordering
operations as seen by the client NFS attribute cache, and on the
server side when the retransmission occurs after the original request
has left the duplicate cache, since the operation will be committed
for a second time.

Further work on retransmission handling is needed (e.g. they are still
being sent too often since they are scaled by HZ, and the size of
the dup cache is too small and easily overwhelmed on busy servers).

Submitted by:   mohans
Approved by:	re (mux)
2006-03-31 07:13:09 +00:00
cel
cfec640a25 Fix a bug in NFSv3 READDIRPLUS reply processing

The client's READDIRPLUS logic skips the attributes and
filehandle of the ".." entry.  If the server doesn't send
attributes but does send a filehandle for "..", the
client's logic doesn't account for the extra "value
follows" field that indicates whether the filehandle is
present, causing the remaining entries in the reply
to be ignored.

This is an MFC of 1.264 in the CURRENT branch.

Sponsored by:   Network Appliance, Inc.
Reviewed by:    rick, mohans
Approved by:    re, silby
2006-03-29 18:11:32 +00:00
delphij
6b5f6d40b5 MFC 1.263: a typo fix (diff reduction against -HEAD)
Approved by:	re (hrs)
2006-03-24 04:48:42 +00:00
pjd
84853dde8e MFC: sys/nfsclient/nfs_diskless.c 1.15
I wanted 'nolockd' here instead of 'lockd'.

Approved by:	re (mux)
2006-03-20 15:45:14 +00:00
scottl
c5719df4a5 MFC: Call vfs_destroy_object() before v_data gets set to NULL.
Approved by: re
2006-03-12 21:50:02 +00:00
pjd
8d7bed0cec MFC: sys/nfsclient/nfs_diskless.c 1.12,1.13
Add boot.nfsroot.options loader tunable.
It allows specifying options for the NFS root file system.
Currently supported options are: soft, intr, conn, lockd.

I'm adding this functionality mostly for 'lockd' option, which is only
honored when performing the initial mount and will be silently ignored
if used while updating the mount options.

This will allow the use of flock(2) without needing varmfs or
rpc.lockd and friends.

Example of use:
boot.nfsroot.options="intr,lockd"

Approved by:	re (scottl)
2006-03-01 18:01:28 +00:00
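A comma-separated option string such as the boot.nfsroot.options value in 8d7bed0cec could be parsed along these lines. This is only a sketch: the flag names and the function are hypothetical, and the option list is the one given in that commit message (a later entry, 84853dde8e, notes that 'nolockd' was intended instead of 'lockd').

```c
#include <string.h>

/* Hypothetical flag bits, one per option named in the commit message. */
enum {
	SKETCH_OPT_SOFT  = 1u << 0,
	SKETCH_OPT_INTR  = 1u << 1,
	SKETCH_OPT_CONN  = 1u << 2,
	SKETCH_OPT_LOCKD = 1u << 3
};

/* True if the len-byte token at tok equals name exactly. */
static int
sketch_match(const char *tok, size_t len, const char *name)
{
	return (strlen(name) == len && strncmp(tok, name, len) == 0);
}

/* Parse "a,b,c" into flag bits; returns 0 on success, -1 on an unknown option. */
static int
sketch_parse_nfsroot_options(const char *opts, unsigned *flags)
{
	const char *p = opts;

	*flags = 0;
	while (*p != '\0') {
		size_t len = strcspn(p, ",");	/* length of the next token */

		if (sketch_match(p, len, "soft"))
			*flags |= SKETCH_OPT_SOFT;
		else if (sketch_match(p, len, "intr"))
			*flags |= SKETCH_OPT_INTR;
		else if (sketch_match(p, len, "conn"))
			*flags |= SKETCH_OPT_CONN;
		else if (sketch_match(p, len, "lockd"))
			*flags |= SKETCH_OPT_LOCKD;
		else
			return (-1);
		p += len;
		if (*p == ',')
			p++;			/* skip the separator */
	}
	return (0);
}
```

Parsing "intr,lockd" with this sketch sets the two corresponding bits; an unrecognized or empty token is rejected rather than ignored.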
yar
dbcb706f58 Work around the shortness of the size argument to
vnode_create_vobject() while preserving the binary ABI
to filesystem modules in RELENG_6: introduce a new function
vnode_create_vobject_off() that takes the size argument
as off_t; move all stock file systems to it; re-implement
the old vnode_create_vobject() using vnode_create_vobject_off()
so that old or binary-only FS modules can work w/o hitting the
bug.  The trick is to pass a size of 0 to vnode_create_vobject_off()
so that it will call VOP_GETATTR() and thus get the actual,
untruncated file size even if the calling module still uses
the old vnode_create_vobject().

PR:		kern/92243
Approved by:	re (scottl)
2006-02-20 00:53:15 +00:00
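The ABI-preserving trick in dbcb706f58 follows a common pattern: keep the old entry point with its narrow argument, and have it delegate to the new wide one with a sentinel that forces a fresh size lookup. A userland sketch with hypothetical types; `uint32_t` merely models "a size argument too short for large files", and `sketch_getattr_size()` stands in for VOP_GETATTR():

```c
#include <stdint.h>

/* Hypothetical vnode: the true file size and the size given to its VM object. */
struct sketch_vnode {
	int64_t real_size;
	int64_t object_size;
};

/* Stand-in for VOP_GETATTR(): report the actual, untruncated size. */
static int64_t
sketch_getattr_size(const struct sketch_vnode *vp)
{
	return (vp->real_size);
}

/* New entry point: wide size argument; 0 means "look the size up". */
static int
sketch_create_vobject_off(struct sketch_vnode *vp, int64_t size)
{
	if (size == 0)
		size = sketch_getattr_size(vp);
	vp->object_size = size;
	return (0);
}

/* Old entry point, signature unchanged so existing modules keep working. */
static int
sketch_create_vobject(struct sketch_vnode *vp, uint32_t size)
{
	(void)size;	/* may already be truncated by the caller; ignore it */
	return (sketch_create_vobject_off(vp, 0));
}
```

A caller that passes a truncated size through the old function still ends up with an object sized from the real attributes.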
rees
24c9cc5118 MFC rev 1.135:
Don't log an error on tcp connection reset, even if we don't get ECONNRESET.

Submitted by:	cel@citi.umich.edu
Approved by:	re (scottl)
2006-02-16 02:39:52 +00:00
rwatson
f017c618f0 Merge nfs_lock.c:1.43 from HEAD to RELENG_6:
In nfs_dolock(), GC now under-used ioflg, rendered obsolete when we moved
  from using a fifo to talk to rpc.lockd to using a special device node.

Approved by:	re (scottl)
2006-02-14 00:06:32 +00:00
tegge
81ceadf72a MFC: Add marker vnodes to ensure that all vnodes associated with the mount
point are iterated over when using MNT_VNODE_FOREACH.
2006-01-14 01:18:03 +00:00
maxim
b08fccbd06 MFC rev. 1.134: fix for a bug where NFS/TCP would
not reconnect (in the case where the server FIN'ed).

PR:		kern/88833
Requested by:	Roman V. Palagin
Approved by:	Mohan Srinivasan
2005-12-15 18:10:37 +00:00
rees
94b8aef59a MFC: nfs_socket.c 1.132, nfs_subs.c 1.142, nfsm_subs.h 1.37
fix a problem with XID re-use when a server returns NFSERR_JUKEBOX.
2005-12-13 21:29:26 +00:00
delphij
88a8009c9a MFC 1.260 (by ps): Fixed a panic that can happen when nfs_lookup() hits
an error.

RELENG_6_0 errata candidate.
2005-11-25 13:27:22 +00:00
glebius
42def59c5e MFC:
- Fix leak of struct nlminfo on process exit.
- Fix malloc type collision, that made the above problem
  difficult to understand.

Reported by:	Vladimir Sharun <sharun ukr.net>

Approved by:	re (scottl)
2005-10-27 18:35:19 +00:00
delphij
31cd49528b MFC (by ps):
| Fixes for NFS crashes on architectures that require strict alignment.
| - Fix nfsm_disct() so that after pulling up data, the remaining data
|   is aligned if necessary.
| - Fix nfs_clnt_tcp_soupcall() to bcopy() the rpc length out of the
|   mbuf (instead of casting m_data to a uint32).
|
| Submitted by:   Pyun YongHyeon
| Reviewed by:    Mohan Srinivasan
|
| Revision  Changes    Path
| 1.118     +12 -3     src/sys/nfs/nfs_common.c
| 1.38      +6 -0      src/sys/nfs/nfs_common.h
| 1.126     +2 -1      src/sys/nfsclient/nfs_socket.c

Approved by:	re (scottl)
2005-10-09 03:21:56 +00:00
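The second fix in 31cd49528b (copying the RPC length out instead of casting the data pointer) is the usual way to read a multi-byte value from a possibly unaligned buffer. A userland sketch using memcpy() in place of the kernel's bcopy(); the function name is hypothetical:

```c
#include <arpa/inet.h>	/* ntohl() */
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit big-endian length from p, which may sit at any byte
 * offset. Dereferencing (const uint32_t *)p here could trap on
 * strict-alignment architectures; copying the bytes cannot.
 */
static uint32_t
sketch_read_rpc_len(const unsigned char *p)
{
	uint32_t len;

	memcpy(&len, p, sizeof(len));
	return (ntohl(len));
}
```

The copy costs a few instructions on architectures that tolerate unaligned loads and avoids a fault on those that do not.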
delphij
373205352c MFC (by ps)
| In nfs_nget() if two threads race on the same filehandle, the loser
| should cause the nfsnode to get freed. This fixes a potential vnode
| (and nfsnode) leak in that path.
|
| Submitted by:   Mohan Srinivasan
| Reviewed by:    phk
|
| Revision  Changes    Path
| 1.78      +2 -1      src/sys/nfsclient/nfs_node.c

Approved by:	re (scottl)
2005-10-09 03:15:36 +00:00
rwatson
5db6e492ee Merge subr_prof.c:1.119, 1.120, 1.121, nfs_socket.c:1.130,
rpcclnt.c:1.14 from HEAD to RELENG_6:

Acquire Giant in uprintf() and tprintf() due to the non-MPSAFEty of
the tty code invoked from these functions.  In two cases, during
timeout handling in NFS-related RPC client code, acquire Giant in
the caller before other mutexes the caller might hold, in order to
avoid lock order reversals with Giant (a recursive acquire is not
a reversal as it won't ever wait).

Correct age-old comments about uprintf()/tprintf() sleeping: they
will never sleep.

Much useful feedback from:	bde
Approved by:			re (scottl)
2005-09-29 18:40:36 +00:00
ps
21b8a6b2f6 MFC: rev 1.127
Fix for an NFS soft mount bug where, if the number of retries exceeded
the max rexmits, the request was not being bounced back with an
ETIMEDOUT error.

Approved by:	re
2005-07-21 16:19:02 +00:00
green
b4b5044eed Ifdef out the incomplete non-blocking IO implementation for NFS
pending discussion of how implementation would proceed.  Applications
like -lc_r expect select(3) to match the EAGAIN-status of IO
functions.

Approved by:	re
2005-06-16 15:43:17 +00:00
green
ff904ffb64 Fix a serious deadlock with the NFS client. Given a large enough
atomic write request, it can fill the buffer cache with the entirety
of that write in order to handle retries.  However, it never drops
the vnode lock, or else it wouldn't be atomic, so it ends up waiting
indefinitely for more buf memory that cannot be gotten as it has it
all, and it waits in an uncancellable state.

To fix this, hibufspace is exported and scaled to a reasonable
fraction.  This is used as the limit of how much of an atomic write
request by the NFS client will be handled asynchronously.  If the
request is larger than this, it will be turned into a synchronous
request which won't deadlock the system.  It's possible this value is
far off from what is required by some, so it shall be tunable as soon
as mount_nfs(8) learns of the new field.

The slowdown between an asynchronous and a synchronous write on NFS
appears to be on the order of 2x-4x.

General nod by:	gad
MFC after:	2 weeks
More testing:	wes
PR:		kern/79208
2005-06-10 23:50:41 +00:00
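The decision described in ff904ffb64 reduces to comparing the request size against a fraction of hibufspace. A minimal sketch; the divisor is invented for illustration (the commit only says "a reasonable fraction") and the function name is hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical divisor; the actual fraction is not given in the log. */
#define SKETCH_WRITE_FRACTION 4

/*
 * Decide whether an atomic write must be issued synchronously because
 * buffering all of it could exhaust the buffer cache while the vnode
 * lock is held.
 */
static bool
sketch_write_must_be_sync(size_t request_bytes, size_t hibufspace_bytes)
{
	size_t limit = hibufspace_bytes / SKETCH_WRITE_FRACTION;

	return (request_bytes > limit);
}
```

Requests at or below the limit stay asynchronous; larger ones pay the 2x-4x synchronous slowdown mentioned above in exchange for not deadlocking.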
des
0bbbcadeb1 Ugh. Previous commit got the logic exactly backward.
Submitted by:	bland
Pointy hat to:	des
2005-05-17 18:23:03 +00:00
des
d3a9750001 Revision 1.173 broke updating a mount from ro to rw. Fix that by clearing
the MNT_RDONLY flag if MNT_UPDATE is set and "ro" was not specified.

Suggested by:	cognet
2005-05-17 12:00:43 +00:00
rees
59c5573379 set R_MUSTRESEND flag in mark_for_reconnect so re-connected requests get
re-sent instead of timing out.

don't log an error message on reconnection, which is not an error.

remove unused nfs_mrep_before_tsleep.

Reviewed by:	Mohan Srinivasan
Approved by:	alfred
2005-05-10 14:25:14 +00:00
ps
40a0d434da Fix a bug in NFS/TCP where retransmissions would not reliably happen
if the server rebooted or tore down the connection for any reason.

Found by:	Jonathan Noack.
Submitted by:	Mohan Srinivasan.
2005-05-04 16:37:31 +00:00
iedowse
2593dad93c Don't copy the NFSMNT_* flags into struct statfs's f_flags field,
as they have no connection with the expected MNT_* flags. This bug
was exposed 18 months ago when the assignments to f_flags in
vfs_syscalls.c were moved to before the VFS_STATFS() call. It was
fixed in the CSRG source 10 years ago, but we never picked up that
change.

PR:		kern/80390
MFC after:	1 week
2005-05-02 15:57:10 +00:00
des
5e15cfc3fa When NFS was converted to the new mount syscall, code was written that sets
the MNT_RDONLY flag if the "ro" option was passed in from userland, and
clears it otherwise.  In the diskless case, the MNT_RDONLY flag is already
set when this code is reached, but there are no mount options, so it was
incorrectly cleared.  Change the logic so the MNT_RDONLY flag is set if the
"ro" option was specified, and left alone otherwise.

Note that the NFS code will still happily let you mount a filesystem RW
even if the server exports it RO.  I'm not sure how to fix that.
2005-04-27 14:46:02 +00:00
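The logic change in 5e15cfc3fa is small enough to show side by side. A sketch with a hypothetical flag value standing in for MNT_RDONLY; neither function is the actual mount code:

```c
#include <stdbool.h>

#define SKETCH_MNT_RDONLY 0x1u	/* hypothetical stand-in for MNT_RDONLY */

/* Before: set the flag if "ro" was passed, clear it otherwise. */
static unsigned
sketch_flags_before(unsigned flags, bool ro_opt)
{
	if (ro_opt)
		flags |= SKETCH_MNT_RDONLY;
	else
		flags &= ~SKETCH_MNT_RDONLY;
	return (flags);
}

/* After: set the flag if "ro" was passed, leave it alone otherwise. */
static unsigned
sketch_flags_after(unsigned flags, bool ro_opt)
{
	if (ro_opt)
		flags |= SKETCH_MNT_RDONLY;
	return (flags);
}
```

In the diskless case the flag arrives already set and no options are passed, so the first version drops it while the second preserves it.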
des
de2d951ab7 While I'm here, list the new kenv (boot.netif.name) along with the others. 2005-04-26 20:47:59 +00:00
des
37881dde0f When netbooting, as soon as we've figured out which interface we booted
from, store its name in a kenv variable.
2005-04-26 20:45:29 +00:00
rees
3e9035accc TCP reconnect is not an error.
Change the message from LOG_ERR to LOG_INFO.

Approved by:	alfred
2005-04-18 13:42:13 +00:00