Commit Graph

204 Commits

Author SHA1 Message Date
Rick Macklem
dd5b5a9431 Patch the experimental NFS client so that there is a timeout
for negative name cache entries in a manner analogous to
r202767 for the regular NFS client. Also, make the code in
nfs_lookup() compatible with that of the regular client
and replace the sysctl variable that enabled negative name
caching with the mount point option.

MFC after:	2 weeks
2010-01-31 19:12:24 +00:00
Rick Macklem
80169e41d6 Patch the experimental NFS client in a manner analogous to
r203072 for the regular NFS client. Also, delete two fields
of struct nfsmount that are not used by the FreeBSD port of
the client.

MFC after:	2 weeks
2010-01-28 16:17:24 +00:00
Rick Macklem
b04b4acda0 Fix three related problems in the experimental nfs client when
checking for conflicts w.r.t. byte range locks for NFSv4.
1 - Return 0 instead of EACCES when a conflict is found, for F_GETLK.
2 - Check for "same file" when checking for a conflict.
3 - Don't check for a conflict for the F_UNLCK case.
2010-01-03 18:27:10 +00:00
Rick Macklem
5d60668c70 Fix the experimental NFS client so that it can create Unix
domain sockets on an NFSv4 mount point. It was generating
incorrect XDR in the request for this case.

Tested by:	infofarmer
MFC after:	2 weeks
2009-12-31 18:02:48 +00:00
Rick Macklem
4a8e21764d When porting the experimental nfs subsystem to the FreeBSD8 krpc,
I added 3 functions that were already in the experimental client
under different names. This patch deletes the functions in the
experimental client and renames the calls to use the other set.
(This is just removal of duplicated code and does not fix any bug.)

MFC after:	2 weeks
2009-12-26 19:15:15 +00:00
Edward Tomasz Napierala
74991298d9 Remove unneeded ifdefs.
Reviewed by:	rmacklem
2009-12-03 18:03:42 +00:00
Jaakko Heinonen
1bb015c07c Create verifier used by FreeBSD NFS client is suboptimal because the
first part of a verifier is set to the first IP address from
V_in_ifaddrhead list. This address is typically the loopback address
making the first part of the verifier practically non-unique. The second
part of the verifier is initialized to zero making its initial value
non-unique too.

This commit changes the strategy for create verifier initialization:
just initialize it to a random value. Also move verifier handling into
its own function and use a mutex to protect the variable.

This change is a candidate for porting to sys/nfsclient.

Reviewed by:	jhb, rmacklem
Approved by:	trasz (mentor)
2009-11-11 15:43:07 +00:00
Jaakko Heinonen
dd697cbf66 Unloading of the nfscl module is unsupported because newnfslock doesn't
support unloading. It's not trivial to implement newnfslock unloading so
for now just admit that unloading is unsupported and refuse to attempt
unload in all nfscl module event handlers.

Reviewed by:	rmacklem
Approved by:	trasz (mentor)
2009-10-20 15:06:18 +00:00
Jaakko Heinonen
349af4de7d Fix ordering of nfscl_modevent() and ncl_uninit(). nfscl_modevent() must
be called after ncl_uninit() when unloading the nfscl module because
ncl_uninit() uses ncl_iod_mutex which is destroyed in nfscl_modevent().

Reviewed by:	rmacklem
Approved by:	trasz (mentor)
2009-10-20 15:01:46 +00:00
Jaakko Heinonen
417612e0e4 Fix comment typos.
Reviewed by:	rmacklem
Approved by:	trasz (mentor)
2009-10-20 14:57:26 +00:00
Rick Macklem
8f63187ec1 Add LK_NOWITNESS to the vn_lock() calls done on newly created nfs
vnodes, since these nodes are not linked into the mount queue and,
as such, the vn_lock() cannot cause a deadlock so LORs are harmless.

Suggested by:	kib
Approved by:	kib (mentor)
MFC after:	3 days
2009-09-09 20:37:49 +00:00
Marko Zec
0348c661d1 Fix NFS panics with options VIMAGE kernels by apropriately setting curvnet
context inside the RPC code.

Temporarily set td's cred to mount's cred before calling socreate() via
__rpc_nconf2socket().

Submitted by:	rmacklem (in part)
Reviewed by:	rmacklem, rwatson
Discussed with:	dfr, bz
Approved by:	re (rwatson), julian (mentor)
MFC after:	3 days
2009-08-24 10:09:30 +00:00
Rick Macklem
60f04e38ca Apply the same patch as r196205 for nfs_upgrade_lock() and
nfs_downgrade_lock() to the experimental nfs client.

Approved by:	re (kensmith), kib (mentor)
2009-08-17 16:12:28 +00:00
Rick Macklem
6a795918c1 Fix the experimental nfs client so that it only calls ncl_vinvalbuf()
for NFSv2 and not NFSv4 when nfscl_mustflush() returns 0. Since
nfscl_mustflush() only returns 0 when there is a valid write delegation
issued to the client, it only affects the case of an NFSv4 mount with
callbacks/delegations enabled.

Approved by:	 re (kensmith), kib (mentor)
2009-07-29 14:50:31 +00:00
Rick Macklem
93a15e2054 When vfs.newnfs.callback_addr is set to an IPv4 address, the
experimental NFSv4 client might try and use it as an IPv6 address,
breaking callbacks. The fix simply initializes the isinet6 variable
for this case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 18:10:44 +00:00
Rick Macklem
c79e697621 Add changes to the experimental nfs client to use the PBDRY flag for
msleep(9) when a vnode lock or similar may be held. The changes are
just a clone of the changes applied to the regular nfs client by
r195703.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:37:53 +00:00
Rick Macklem
7f1968ba10 When using an NFSv4 mount in the experimental nfs client with delegations
being issued from the server, there was a case where an Open issued locally
based on the delegation would be released before the associated vnode
became inactive. If the delegation was recalled after the open was released,
an Open against the server would not have been acquired and subsequent I/O
operations would need to use the special stateid of all zeros. This patch
fixes that case.

Approved by:	re (kensmith), kib (mentor)
2009-07-22 14:32:28 +00:00
Rick Macklem
80e556956e Fix two bugs in the experimental nfs client:
- When the root vnode was acquired during mounting, mnt_stat.f_iosize was
  still set to 0, so getnewvnode() would set bo_bsize == 0. This would
  confuse getblk(), so that it always returned the first block causing
  the problem when the root directory of the mount point was greater
  than one block in size. It was fixed by setting mnt_stat.f_iosize to
  NFS_DIRBLKSIZ before calling ncl_nget() to acquire the root vnode.
- NFSMNT_INT was being set temporarily while the initial connect to a
  server was being done. This erroneously configured the krpc for
  interruptible RPCs, which caused problems because signals weren't
  being masked off as they would have been for interruptible mounts.
  This code was deleted to fix the problem. Since mount_nfs does an
  NFS null RPC before the mount system call, connections to the server
  should work ok.

Tested by:	swell dot k at gmail dot com
Approved by:	re (kensmith), kib (mentor)
2009-07-19 16:44:26 +00:00
Rick Macklem
405229f913 Fix the experimental nfs client so that it does not cause a
"share->excl" panic when doing a lookup of dotdot at the root
of a server's file system. The patch avoids calling vn_lock()
for that case, since nfscl_nget() has already acquired a lock
for the vnode.

Approved by:	re (kensmith), kib (mentor)
2009-07-14 23:10:23 +00:00
Robert Watson
eddfbb763d Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
Rick Macklem
ad86aef9af Fix the handling of dotdot in lookup for the experimental nfs client
in a manner analagous to the change in r195294 for the regular nfs client.

Approved by:	re (kensmith), kib (mentor)
2009-07-12 17:02:17 +00:00
Rick Macklem
9ca27b565b Since the nfscl_getclose() function both decremented open counts and,
optionally, created a separate list of NFSv4 opens to be closed, it
was possible for the associated OpenOwner to be free'd before the Open
was closed. The problem was that the Open was taken off the OpenOwner
list before the Close RPC was done and OpenOwners can be free'd once the
list is empty. This patch separates out the case of doing the Close RPC
into a separate function called nfscl_doclose() and simplifies nfsrpc_doclose()
so that it closes a single open instead of a list of them. This avoids
removing the Open from the OpenOwner list before doing the Close RPC.

Approved by:	re (kensmith), kib (mentor)
2009-07-09 19:00:29 +00:00
Robert Watson
2d9cfabad4 Add a new global rwlock, in_ifaddr_lock, which will synchronize use of the
in_ifaddrhead and INADDR_HASH address lists.

Previously, these lists were used unsynchronized as they were effectively
never changed in steady state, but we've seen increasing reports of
writer-writer races on very busy VPN servers as core count has gone up
(and similar configurations where address lists change frequently and
concurrently).

For the time being, use rwlocks rather than rmlocks in order to take
advantage of their better lock debugging support.  As a result, we don't
enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion
is complete and a performance analysis has been done.  This means that one
class of reader-writer races still exists.

MFC after:      6 weeks
Reviewed by:    bz
2009-06-25 11:52:33 +00:00
Rick Macklem
65cc6600c5 Replace RPCAUTH_UNIXGIDS with NFS_MAXGRPS so that nfscbd.c will build.
Approved by:	kib (mentor)
2009-06-20 17:11:07 +00:00
Rick Macklem
2c1e6cce5b Change the size of the nfsc_groups[] array in the experimental nfs
client to RPCAUTH_UNIXGIDS + 1 (17), since that is what can go on
the wire for AUTH_SYS authentication.

Reviewed by:	brooks
Approved by:	kib (mentor)
2009-06-20 00:54:57 +00:00
Brooks Davis
838d985825 Rework the credential code to support larger values of NGROUPS and
NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024
and 1023 respectively.  (Previously they were equal, but under a close
reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it
is the number of supplemental groups, not total number of groups.)

The bulk of the change consists of converting the struct ucred member
cr_groups from a static array to a pointer.  Do the equivalent in
kinfo_proc.

Introduce new interfaces crcopysafe() and crsetgroups() for duplicating
a process credential before modifying it and for setting group lists
respectively.  Both interfaces take care for the details of allocating
groups array. crsetgroups() takes care of truncating the group list
to the current maximum (NGROUPS) if necessary.  In the future,
crsetgroups() may be responsible for insuring invariants such as sorting
the supplemental groups to allow groupmember() to be implemented as a
binary search.

Because we can not change struct xucred without breaking application
ABIs, we leave it alone and introduce a new XU_NGROUPS value which is
always 16 and is to be used or NGRPS as appropriate for things such as
NFS which need to use no more than 16 groups.  When feasible, truncate
the group list rather than generating an error.

Minor changes:
  - Reduce the number of hand rolled versions of groupmember().
  - Do not assign to both cr_gid and cr_groups[0].
  - Modify ipfw to cache ucreds instead of part of their contents since
    they are immutable once referenced by more than one entity.

Submitted by:	Isilon Systems (initial implementation)
X-MFC after:	never
PR:		bin/113398 kern/133867
2009-06-19 17:10:35 +00:00
Alan Cox
57a7e73261 Fix some of the style errors in *getpages(). 2009-06-18 05:56:24 +00:00
Rick Macklem
76b30a0cd4 Add the SVC_RELEASE(xprt), as required by r194407.
Approved by:	kib (mentor)
2009-06-17 22:55:59 +00:00
Bjoern A. Zeeb
ebd8672cc3 Add explicit includes for jail.h to the files that need them and
remove the "hidden" one from vimage.h.
2009-06-17 15:01:01 +00:00
Rick Macklem
81e3c4fc8e Fix handling of ".." in nfs_lookup() for the forced dismount case
by cribbing the change made to the regular nfs client in r194358.

Approved by:	kib (mentor)
2009-06-17 14:10:18 +00:00
Jamie Gritton
c1f192193d Rename the host-related prison fields to be the same as the host.*
parameters they represent, and the variables they replaced, instead of
abbreviated versions of them.

Approved by:	bz (mentor)
2009-06-13 15:39:12 +00:00
Jamie Gritton
01de879ac7 Use getcredhostuuid instead of accessing the prison directly.
Approved by:	bz (mentor)
2009-06-13 15:35:22 +00:00
Rick Macklem
934a309971 This commit is analagous to r193952, but for the experimental nfs
subsystem. Add a test for VI_DOOMED just after ncl_upgrade_vnlock() in
ncl_bioread_check_cons(). This is required since it is possible
for the vnode to be vgonel()'d while in ncl_upgrade_vnlock() when
a forced dismount is in progress. Also, move the check for VI_DOOMED
in ncl_vinvalbuf() down to after ncl_upgrade_vnlock() and replace the
out of date comment for it.

Approved by:	kib (mentor)
2009-06-10 21:16:39 +00:00
Rick Macklem
410654ec1b Since vn_lock() with the LK_RETRY flag never returns an error
for FreeBSD-CURRENT, the code that checked for and returned the
error was broken. Change it to check for VI_DOOMED set after
vn_lock() and return an error for that case. I believe this
should only happen for forced dismounts.

Approved by:	kib (mentor)
2009-06-09 15:18:01 +00:00
Rick Macklem
2fa32154e4 Fix nfscl_getcl() so that it doesn't crash when it is called to
do an NFSv4 Close operation with the cred argument NULL. Also,
clarify what NULL arguments mean in the function's comment.

Approved by:	kib (mentor)
2009-06-08 18:41:23 +00:00
Alan Cox
1f17689408 nfs_write() can use the recently introduced vfs_bio_set_valid() instead of
vfs_bio_set_validclean(), thereby avoiding the page queues lock.

Garbage collect vfs_bio_set_validclean().  Nothing uses it any longer.
2009-05-31 20:18:02 +00:00
Marko Zec
705fe7ce35 Unbreak options VIMAGE kernel builds.
Approved by:	julian (mentor)
2009-05-31 11:57:51 +00:00
Rick Macklem
d00a615a98 Add a check to v_type == VREG for the recently modified code that
does NFSv4 Closes in the experimental client's VOP_INACTIVE().
I also replaced a bunch of ap->a_vp with a local copy of vp,
because I thought that made it more readable.

Approved by:	kib (mentor)
2009-05-30 22:11:12 +00:00
Jamie Gritton
76ca6f88da Place hostnames and similar information fully under the prison system.
The system hostname is now stored in prison0, and the global variable
"hostname" has been removed, as has the hostname_mtx mutex.  Jails may
have their own host information, or they may inherit it from the
parent/system.  The proper way to read the hostname is via
getcredhostname(), which will copy either the hostname associated with
the passed cred, or the system hostname if you pass NULL.  The system
hostname can still be accessed directly (and without locking) at
prison0.pr_host, but that should be avoided where possible.

The "similar information" referred to is domainname, hostid, and
hostuuid, which have also become prison parameters and had their
associated global variables removed.

Approved by:	bz (mentor)
2009-05-29 21:27:12 +00:00
Alan Cox
3933ec4d15 Make *getpages()s' assertion on the state of each page's dirty bits
stricter.
2009-05-28 18:11:09 +00:00
Rick Macklem
fdd88deeba Fix handling of NFSv4 Close operations in ncl_inactive(). Only
do them for NFSv4 and flush writes to the server before doing
the Close(s), as required. Also, use the a_td argument instead of
curthread.

Approved by:	kib (mentor)
2009-05-27 19:41:29 +00:00
Rick Macklem
63bde62e9a Fix the experimental nfsv4 client so that it works for the
case of a kerberized mount without a host based principal
name. This will only work for mounts being done by a user
other than root. Support for a host based principal name
will not work until proposed changes to the rpcsec_gss part
of the krpc are committed. It now builds for "options KGSSAPI".

Approved by:	kib (mentor)
2009-05-24 03:22:49 +00:00
Rick Macklem
b66124f314 Change the sysctl_base argument to svcpool_create() to NULL for
client side callbacks so that leaf names are not re-used,
since they are already being used by the server.

Approved by:	kib (mentor)
2009-05-22 23:22:56 +00:00
Rick Macklem
9dd7554f09 Fix the name of the module common to the client and server
in the experimental nfs subsystem to the correct one for
the MODULE_DEPEND() macro.

Approved by:	kib (mentor)
2009-05-22 20:55:29 +00:00
Rick Macklem
682eec57c4 Modify the mount handling code in the experimental nfs client to
use the newer nmount() style arguments, as is used by mount_nfs.c.
This prepares the kernel code for the use of a mount_nfs.c with
changes for the experimental client integrated into it.

Approved by:	kib (mentor)
2009-05-22 15:08:12 +00:00
Rick Macklem
6ffb637d4a Change the code in the experimental nfs client to avoid flushing
writes upon close when a write delegation is held by the client.
This should be safe to do, now that nfsv4 Close operations are
delayed until ncl_inactive() is called for the vnode.

Approved by:	kib (mentor)
2009-05-22 15:01:47 +00:00
Rick Macklem
47a598563d Change the experimental NFSv4 client so that it does not do
the NFSv4 Close operations until ncl_inactive(). This is
necessary so that the Open StateIDs are available for doing
I/O on mmap'd files after VOP_CLOSE(). I also changed some
indentation for the nfscl_getclose() function.

Approved by:	kib (mentor)
2009-05-18 21:22:03 +00:00
Alan Cox
f2eb372e02 Merge r191964: Eliminate a case of unnecessary page queues locking. 2009-05-17 06:45:30 +00:00
Rick Macklem
72d1bbba52 Changed sys/fs/nfs_clbio.c in the same way Alan Cox changed
sys/nfsclient/nfs_bio.c for r192134, so that the sources stay
in sync.

Approved by:	kib (mentor)
2009-05-16 22:31:38 +00:00
Rick Macklem
d34b41c9c2 Modify the diskless booting code in sys/fs/nfsclient to be compatible
with what is in sys/nfsclient, so that it will at least build now.

Approved by:	kib (mentor)
2009-05-15 16:03:11 +00:00
Rick Macklem
98ad44534e Apply changes to the experimental nfs server so that it uses the security
flavors as exported in FreeBSD-CURRENT. This allows it to use a
slightly modified mountd.c instead of a different utility.

Approved by:	kib (mentor)
2009-05-14 21:39:08 +00:00
Rick Macklem
a770e183dd Apply a one line change to nfs_clbio.c (which is largely a copy
of sys/nfsclient/nfs_bio.c) to track the change recently committed
by acl for nfs_bio.c.

Approved by:	kib (mentor)
2009-05-13 21:18:34 +00:00
Attilio Rao
dfd233edd5 Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS.  Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled.  Bump __FreeBSD_version in order to signal such
situation.
2009-05-11 15:33:26 +00:00
Rick Macklem
9ec7b004d0 Add the experimental nfs subtree to the kernel, that includes
support for NFSv4 as well as NFSv2 and 3.
	It lives in 3 subdirs under sys/fs:
	nfs - functions that are common to the client and server
	nfsclient - a mutation of sys/nfsclient that call generic functions
	to do RPCs and handle state. As such, it retains the
	buffer cache handling characteristics and vnode semantics that
	are found in sys/nfsclient, for the most part.
	nfsserver - the server. It includes a DRC designed specifically for
	NFSv4, that is used instead of the generic DRC in sys/rpc.
	The build glue will be checked in later, so at this point, it
	consists of 3 new subdirs that should not affect kernel building.

Approved by:	kib (mentor)
2009-05-04 15:23:58 +00:00