is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.
occur due to np->n_size potentially changing if nfs_getcacheblk()
blocks in nfs_write().
Second, under -current we must supply the proper bufsize when obtaining
buffers that straddle the EOF, but due to the fact that np->n_size can
change out from under us it is possible that we may specify the wrong
buffer size and wind up truncating dirty data written by another
process.
Both problems are solved by implementing nfs_rslock(), which allows us
to lock around sensitive buffer cache operations such as those that
occur when appending to a file.
It is believed that this race is responsible for causing dirtyoff/dirtyend
and (in stable) validoff/validend to exceed the buffer size. Therefore
we have now added a warning printf for the dirtyoff/end case in current.
However, we have introduced a new problem which we need to fix at some
point, and that is that soft or intr NFS mounts may become
uninterruptable from the point of view of process A which is stuck waiting
on rslock while process B is stuck doing the rpc. To unstick process A,
process B would have to be interrupted first.
Reviewed by: Alfred Perlstein <bright@wintelcom.net>
cache. If the cached result lets us say "yes", then go with that. If
we're not sure, or we think the answer might be "no", go to the wire to be
certain. This avoids all of the possible false veto cases, and allows us
to key the cached value with just the UID for which the cached value holds,
reducing the bloat of the nfsnode structure from 104 bytes to just 12 bytes.
Since the "yes" case is by far the most common, this should still provide
a substantial performance improvement. Also default the cache to on, with
a conservative timeout (2 seconds). This improves performance if NFS is
loaded as a KLD module, as there's not (yet) code to parse an option out
of the module arguments to set it, and sysctl doesn't work (yet) for OIDs
in modules.
The 'accelerator' mode was suggested by Bjoern Groenvall (bg@sics.se)
Feedback on this would be appreciated as testing has been necessarily
limited by Comdex, and it would be valuable to have this in 2.2.8.
This yields startling performance increases for NFS clients for many
access profiles, due to the fact that ACCESS results are persistently
cached in the namecache in many cases.
Note that the code is somewhat conservative in that it requires an
exact credential match for a cache hit. This bloats the nfsnode
structure by sizeof(struct ucred) (96 bytes). Any less conservative
approach opens the possibility for a false veto in eg. setuid
applications. Alternative suggestions would be welcomed.
The cache is normally disabled, to activate set the sysctl variable
vfs.nfs.access_cache_timeout to a nonzero value. This is the time in
seconds that a cached entry will be considered valid; useful values appear
to be 2-10 seconds. Performance of the cache can be monitored with the
vfs.nfs.access_cache_hits and vfs.nfs.access_cache_hits variables.
1. Add new file "sys/kern/vfs_default.c" where default actions for
VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE,
POLL, REVOKE and STRATEGY. Various stuff spread over the entire
tree belongs here.
2. Change VOP_BLKATOFF to a normal function in cd9660.
3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These
are private interface functions between UFS and the underlying
storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now
live in struct ufsmount instead.
4. Remove a kludge of VOP_ functions in all filesystems, that did
nothing but obscure the simplicity and break the expandability.
If a filesystem doesn't implement VOP_FOO, it shouldn't have an
entry for it in its vnops table. The system will try to DTRT
if it is not implemented. There are still some cruft left, but
the bulk of it is done.
5. Fix another VCALL in vfs_cache.c (thanks Bruce!)
and b_validend. The changes to vfs_bio.c are a bit ugly but hopefully
can be tidied up later by a slight redesign.
PR: kern/2573, kern/2754, kern/3046 (possibly)
Reviewed by: dyson
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.
Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
existing mechanism uses a global queue for some buffers and the
vp->b_dirtyblkhd queue for others. This turns sequential writes into
randomly ordered writes to the server, affecting both read and write
performance. The existing mechanism also copes badly with hung
servers, tending to block accesses to other servers when all the iods
are waiting for a hung server.
The new mechanism uses a queue for each mount point. All asynchronous
i/o goes through this queue which preserves the ordering of requests.
A simple mechanism ensures that the iods are shared out fairly between
active mount points. This removes the sysctl variable vfs.nfs.dwrite
since the new queueing mechanism removes the old delayed write code
completely.
This should go into the 2.2 branch.
it 1138 times (:-() in casts and a few more times in declarations.
This change is null for the i386.
The type has to be `typedef int vop_t(void *)' and not `typedef
int vop_t()' because `gcc -Wstrict-prototypes' warns about the
latter. Since vnode op functions are called with args of different
(struct pointer) types, neither of these function types is any use
for type checking of the arg, so it would be preferable not to use
the complete function type, especially since using the complete
type requires adding 1138 casts to avoid compiler warnings and
another 40+ casts to reverse the function pointer conversions before
calling the functions.
These functions went away:
enosys (hasn't been used for some time)
enxio
enodev
enoioctl (was used only once, actually for a vop)
if_tun.c:
Continued cleaning up...
conf.h:
Probably fixed the type of d_reset_t. It is hard to tell the correct
type because there are no non-dummy device reset functions.
Removed last vestige of ambiguous sleep message strings.
The version 2 support has been tested (client+server) against FreeBSD-2.0,
IRIX 5.3 and FreeBSD-current (using a loopback mount). The version 2 support
is stable AFAIK.
The version 3 support has been tested with a loopback mount and minimally
against an IRIX 5.3 server. It needs more testing and may have problems.
I have patched amd to support the new variable length filehandles although
it will still only use version 2 of the protocol.
Before booting a kernel with these changes, nfs clients will need to at least
build and install /usr/sbin/mount_nfs. Servers will need to build and
install /usr/sbin/mountd.
NFS diskless support is untested.
Obtained from: Rick Macklem <rick@snowhite.cis.uoguelph.ca>
- Make a number of filesystems work again when they are statically compiled
(blush)
- FIFOs are no longer optional; ``options FIFO'' removed from distributed
config files.
use it in NFS. This is required both for diskless support and for POSIX
compliance. Note: the support in NFS is only for the local node.
Submitted by: based on work originally done by Yuval Yurom