vfs.nfs.iodmaxidle (idle time before nfsiod's exit). Make it adaptive
so that we create nfsiod's on demand and they go away after not being
used for a while. The upper limit is NFS_MAXASYNCDAEMON (currently 20).
More will be done here, but this is a useful checkpoint.
Submitted by: Maxime Henrion <mux@qualys.com>
process of being unmounted. This allows forced NFS unmounts to
complete even if there are processes stuck holding the mnt_lock
while the server is down. The mechanism is not ideal in that there
is a small chance we might accidentally cancel requests during a
failed non-forced unmount attempt on that filesystem, but this
is not really a big problem.
Also, move the tsleep() in nfs_nmcancelreqs() so that we do not
sleep in the case where there are no requests to be cancelled.
down, even if there are hung processes and the mount is non-
interruptible.
This works by having nfs_unmount call a new function nfs_nmcancelreqs()
in the FORCECLOSE case. It scans the list of outstanding requests
and marks as interrupted any requests belonging to the specified
mount. Then it waits up to 30 seconds for all requests to terminate.
A few other changes are necessary to support this:
- Unconditionally set a socket timeout so that even hard mounts
are guaranteed to occasionally check the R_SOFTTERM flag on
requests. For hard mounts this flag can only be set by
nfs_nmcancelreqs().
- Reject requests on a mount that is currently being unmounted.
- Never grant the receive lock to a request that has been cancelled.
This should also avoid an old problem where a forced NFS unmount
could cause a crash; it occurred when a VOP on an unlocked vnode
(usually VOP_GETATTR) was in progress at the time of the forced
unmount.
socreate(), rather than getting it implicitly from the thread
argument.
o Make NFS cache the credential provided at mount-time, and use
the cached credential (nfsmount->nm_cred) when making calls to
socreate() on initially connecting, or reconnecting the socket.
This fixes bugs involving NFS over TCP and ipfw uid/gid rules, as well
as bugs involving NFS and mandatory access control implementations.
Reviewed by: freebsd-arch
1222 bytes (derived as the maximum that isc-dhcpd uses). This solves
the problem if a bootp/DHCP reply is over 256 bytes in which the
end of the bootp/DHCP reply will not be found and then the reply will
be ignored. This happens when swap and root paths are longish or many
parameters are set.
Reviewed by: imp
Approved by: imp
temporary storage. In the old NFS code it wasn't at all clear if
the value of `tl' was used across or after macro calls, but I'm
fairly confident that the convention was to keep its use local.
Each ex-macro function now uses a local version of this variable,
so all of the double-indirection goes away.
The only exception to the `local use' rule for `tl' is nfsm_clget(),
which is left unchanged by this commit.
Reviewed by: peter
commit by Kirk also fixed a softupdates bug that could easily be triggered
by server side NFS.
* An edge case with shared R+W mmap()'s and truncate whereby
the system would inappropriately clear the dirty bits on
still-dirty data. (applicable to all filesystems)
THIS FIX TEMPORARILY DISABLED PENDING FURTHER TESTING.
see vm/vm_page.c line 1641
* The straddle case for VM pages and buffer cache buffers when
truncating. (applicable to NFS client side)
* Possible SMP database corruption due to vm_pager_unmap_page()
not clearing the TLB for the other cpu's. (applicable to NFS
client side but could effect all filesystems). Note: not
considered serious since the corruption occurs beyond the file
EOF.
* When flusing a dirty buffer due to B_CACHE getting cleared,
we were accidently setting B_CACHE again (that is, bwrite() sets
B_CACHE), when we really want it to stay clear after the write
is complete. This resulted in a corrupt buffer. (applicable
to all filesystems but probably only triggered by NFS)
* We have to call vtruncbuf() when ftruncate()ing to remove
any buffer cache buffers. This is still tentitive, I may
be able to remove it due to the second bug fix. (applicable
to NFS client side)
* vnode_pager_setsize() race against nfs_vinvalbuf()... we have
to set n_size before calling nfs_vinvalbuf or the NFS code
may recursively vnode_pager_setsize() to the original value
before the truncate. This is what was causing the user mmap
bus faults in the nfs tester program. (applicable to NFS
client side)
* Fix to softupdates (see ufs/ffs/ffs_inode.c 1.73, commit made
by Kirk).
Testing program written by: Avadis Tevanian, Jr.
Testing program supplied by: jkh / Apple (see Dec2001 posting to freebsd-hackers with Subject 'NFS: How to make FreeBS fall on its face in one easy step')
MFC after: 1 week
reference: with td->td_ucred, it will be desirable to authorize
based on td->td_ucred, rather than p->p_ucred.
o Since the same variable 'p' was later used with pfind() on the target
process for the wakeup, introduce a new local variable 'targetp'
to use instead.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
to avoid the need for rpc.lockd to perform client locks. Using
this option a user can revert back to using local locks for NFS mounts
like we did before we had rpc.lockd.
one to perform a vn_open using temporary/other/fake credentials.
Modify the nfs client side locking code to use vn_open_cred() passing
proc0's ucred instead of the old way which was to temporary raise
privs while running vn_open(). This should close the race hopefully.
in wdrain during a write. This flag needs to be used in devices whos
strategy routines turn-around and issue another high level I/O, such as
when MD turns around and issues a VOP_WRITE to vnode backing store, in order
to avoid deadlocking the dirty buffer draining code.
Remove a vprintf() warning from MD when the backing vnode is found to be
in-use. The syncer of buf_daemon could be flushing the backing vnode at
the time of an MD operation so the warning is not correct.
MFC after: 1 week
struct ucred to userland. In 5.0-CURRENT, it is desirable to instead
export struct xucred, as ucred contains mutexes, pointers, and other
kernel evil. I'll add it to my work queue.
implementation, so that the information doesn't get lost.
(1) /var/run/lock is looked up relative to the current thread's root
directory, but it's not clear that's desirable.
(2) A race condition associated with live credential modification on
a shared credential is present when privilege is granted for
the purposes of talking to /var/run/lock.
- crhold() returns a reference to the ucred whose refcount it bumps.
- crcopy() now simply copies the credentials from one credential to
another and has no return value.
- a new crshared() primitive is added which returns true if a ucred's
refcount is > 1 and false (0) otherwise.
next to equivalent m_len adjustments. Move the nfsm_subs.h macros
into groups depending on which phase they are used in, since that
affects the error recovery requirements. Collect some of the common error
checking into a single macro as preparation for unwinding some more.
Have nfs_rephead return a value instead of secretly modifying args.
Remove some unused function arguments that were being passed around.
Clarify nfsm_reply()'s error handling (I hope).
will pass NULL as the struct proc when td is NULL. This has stopped
crashing on my machine.
Note: The passing of NULL may be bogus, but I'll let others fix that
problem.
Reviewed by: jhb
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
a temporary array to store struct buf pointers if the list doesn't
fit in a local array. Usually it frees the array when finished,
but if it jumps to the 'again' label and the new list does fit in
the local array then it can forget to free a previously malloc'd
M_TEMP memory.
Move the free() up a line so that it frees any previously allocated
memory whether or not it needs to malloc a new array.
Reviewed by: dillon
(this commit is just the first stage). Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.