experimental NFS server, to handle the case where an
exported file system is forced dismounted while an RPC
is in progress. Further commits will fix the cases where
a mount point is used when the associated vnode isn't locked.
Reviewed by: kib
MFC after: 2 weeks
for RPC operations when it can. Since VFS_FHTOVP() currently
always gets an exclusively locked vnode and is usually called
at the beginning of each RPC, the RPCs for a given vnode will
still be serialized. As such, passing a lock type argument to
VFS_FHTOVP() would be preferable to doing the vn_lock() with
LK_DOWNGRADE after the VFS_FHTOVP() call.
Reviewed by: kib
MFC after: 2 weeks
NFS server, so that it can avoid calling VOP_ISLOCKED()
when the vnode is known to be locked. This will allow
LK_SHARED to be used for these cases, which happen to
be all the cases that can use LK_SHARED. This does not
fix any bug, but it reduces the number of calls to
VOP_ISLOCKED() and prepares the code so that it can be
switched to using LK_SHARED in a future patch.
Reviewed by: kib
MFC after: 2 weeks
readdir functions. In particular, get rid of two bogus
VOP_ISLOCKED() calls. Removing the VOP_ISLOCKED() calls
is the only actual bug fixed by this patch.
Reviewed by: kib
MFC after: 2 weeks
increasing directory offset cookies, disable the UFS related
loop that skips over directory entries at the beginning of
the block for the experimental NFS server. This loop is
required for UFS since it always returns directory entries
starting at the beginning of the block that
the requested directory offset is in. In discussion with pjd@
and mckusick@ it seems that this behaviour of UFS should maybe
change, with this fix being an interim patch until then.
This patch only fixes the experimental server, since pjd@ is
working on a patch for the regular server.
Discussed with: pjd, mckusick
MFC after: 5 days
experimental NFSv4 server. The first was a bogus use of VOP_ISLOCKED()
in a KASSERT() and the second was the need to lock the vnode for the
nfsrv_checkremove() call. Also, delete a "__unused" that was bogus,
since the argument is used.
Reviewed by: zack.kirsch at isilon.com
MFC after: 2 weeks
experimental NFSv4 server to a NFSv4 client when delegations are not
being issued, even if the client advertises a callback path.
This avoids a problem where a Linux client advertises a
callback path that doesn't work, due to a firewall, and then
times out an Open attempt before the FreeBSD server gives up
its callback connection attempt. (Suggested by
drb at karlov.mff.cuni.cz to fix the Linux client problem that
he reported on the fs-stable mailing list.)
The server should probably have
a 1sec timeout on callback connection attempts when there are
no delegations issued to the client, but that patch will require
changes to the krpc and this serves as a work around until then.
Tested by: drb at karlov.mff.cuni.cz
MFC after: 5 days
to use the generic hash32_buf() function. Although adding the
bytes seemed sufficient for UFS and ZFS, since most of the bytes
are the same for file handles on the same volume, this might not
be sufficient for other file systems. Use of a generic function
also seems preferable to one specific to NFSv4.
Suggested by: gleb.kurtsou at gmail.com
MFC after: 10 days
server so that it will work better for non-UFS file systems.
The new function simply sums the bytes of the fh_fid field
of fhandle_t.
MFC after: 10 days
r214049 for the regular NFS server, so that it will not do
a VOP_LOOKUP() of ".." when at the root of a file system
when performing a ReaddirPlus RPC.
MFC after: 10 days
NFSv4 server more readable. Mostly changes to comments, but a
case of >= is changed to >, since == can never happen. Also, I've
added a couple of KASSERT()s and a slight optimization, since
once the "else if" case happens, subsequent locks in the list can't
have any effect. None of these changes fixes any known bug.
MFC after: 2 weeks
it frees local locks correctly upon close. In order for
nfsrv_localunlock() to work correctly, the lock can no longer be in
the lockowner's stateid list. As such, nfsrv_freenfslock() has to
be called before nfsrv_localunlock(), to get rid of the lock structure
on the lockowner's stateid list. This only affected operation when
local locks (vfs.newnfs.enable_locallocks=1) are enabled, which is
not the default at this time.
MFC after: 1 week
unlock operations correctly. It was passing in F_SETLK instead of
F_UNLCK as the operation for the unlock case. This only affected
operation when local locking (vfs.newnfs.enable_locallocks=1) was enabled.
MFC after: 1 week
zack.kirsch at isilon.com for a race between nfsrv_freeopen()
and nfsrv_getlockfile() in the experimental NFS server that
he found during testing. Although nfsrv_freeopen() holds a
sleep lock on the lock file structure when called with
cansleep != 0, nfsrv_getlockfile() could still search the
list, once it acquired the NFSLOCKSTATE() mutex. I believe
that acquiring the mutex in nfsrv_freeopen() fixes the race.
MFC after: 2 weeks
nfsd_recalldelegation() function, since this function is called
by nfsd threads when they are handling NFSv2 or NFSv3 RPCs, where
no reference count would have been acquired.
MFC after: 2 weeks
the correct mutex when checking nfsv4root_lock. Although this
could be fixed by adding mutex lock/unlock calls, zack.kirsch at
isilon.com suggested a better fix that uses a non-blocking
acquisition of a reference count on nfsv4root_lock. This fix
allows the weird NFSLOCKSTATE(); NFSUNLOCKSTATE(); synchronization
to be deleted. This patch applies this fix.
Tested by: zack.kirsch at isilon.com
MFC after: 2 weeks
count on nfsv4rootfs_lock when dumping state, since these functions
are not called by nfsd threads. Without this reference count, it
is possible for an nfsd thread to acquire an exclusive lock on
nfsv4rootfs_lock while the dump is in progress and then change the
lists, potentially causing a crash.
Reported by: zack.kirsch at isilon.com
MFC after: 2 weeks
released a reference count on nfsv4rootfs_lock erroneously when
administrative revocation of state was done.
Submitted by: zack.kirsch at isilon.com
MFC after: 2 weeks
server so that the modules will load when kernels are built with
none of the NFS* configuration options specified. I believe this
resolves the problems reported by PR kern/144458 and the email on
freebsd-stable@ posted by Dmitry Pryanishnikov on June 13.
Tested by: kib
PR: kern/144458
Reviewed by: kib
MFC after: 1 week
DIAGNOSTIC and #ifndef DIAGNOSTIC for debug assertions, prefer
KASSERT(). Also change one #ifdef DIAGNOSTIC in the new nfs server.
Submitted by: Mikolaj Golub <to.my.trociny gmail com>
MFC after: 2 weeks
during the grace period after startup. This grace period must
be at least the lease duration, which is typically 1-2 minutes.
It seems prudent for the experimental NFS client to wait a few
seconds before retrying such an RPC, so that the server isn't
flooded with non-recovery RPCs during recovery. This patch adds
an argument to nfs_catnap() to implement a 5 second delay
for this case.
MFC after: 1 week
checks on the length of the client's open/lock owner name. Also,
add free()'s for one case where they were missing and would
have caused a leak if NFSERR_BADXDR had been replied. Probably
never happens, but the leak is now plugged, just in case.
MFC after: 2 weeks
in the readdir functions for non-positive byte count arguments.
For the negative case, set it to the maximum allowable, since it
was actually a large positive value (unsigned) on the wire.
Also, fix up the readdir function comment a bit.
Suggested by: dillon AT apollo.backplane.com
MFC after: 2 weeks
path buffer for one case where it was missing when doing mkdir.
This could have conceivably resulted in a leak of a buffer, but
a leak was never observed during testing, so I suspect it would
have occurred rarely, if ever, in practice.
MFC after: 2 weeks
NFS server for the CREATE cn_nameiop where SAVESTART isn't set.
I was not aware that this needed to be done by the caller until
recently.
Tested by: lampa AT fit.vutbr.cz (link case)
Submitted by: lampa AT fit.vutbr.cz (link case)
MFC after: 2 weeks
on the server for the experimental nfs server. When enabled
by setting vfs.newnfs.locallocks_enable to non-zero, the
experimental nfs server will now acquire byte range locks
on the file on behalf of NFSv4 clients, such that lock
conflicts between the NFSv4 clients and processes running
locally on the server, will be recognized and handled correctly.
MFC after: 2 weeks
- so_pcb is now guaranteed to be non-NULL and valid if a valid socket
reference is held.
- Need to check INP_TIMEWAIT and INP_DROPPED before assuming inp_ppcb is a
tcpcb, as it might be a tcptw or NULL otherwise.
- tp can never be NULL by the end of the function, so only check
TCPS_ESTABLISHED before extracting tcpcb fields.
The NFS server arguably incorporates too many assumptions about TCP
internals, but fixing that is left for nother day.
MFC after: 1 week
Reviewed by: bz
Reviewed and tested by: rmacklem
Sponsored by: Juniper Networks
caching code for IPv6 by fixing a typo that used the incorrect variable.
It also fixes the indentation of the statement above it.
Reported by: simon AT comsys.ntu-kpi.kiev.ua
MFC after: 5 days
was broken w.r.t. byte range lock conflicts when it was the same client
and the request used the open_to_lock_owner4 case, since lckstp->ls_clp
was not set. This patch fixes it by using "clp" instead of "lckstp->ls_clp".
MFC after: 2 weeks
This is necessary in order to enable NFSv4 ACL support. The
argument to nfsvno_accchk() was changed to an accmode_t and
the function nfsrv_aclaccess() was no longer needed and,
therefore, deleted.
Reviewed by: trasz
MFC after: 2 weeks
using VOP_LOOKUP() when VFS_VGET() returns EOPNOTSUPP in the
ReaddirPlus RPC. This patch is based upon one by pjd@ for the
regular nfs server which has not yet been committed. It is needed
when a ZFS volume is exported and ReaddirPlus (which almost
always happens for NFSv4) is performed by a client. The patch
also simplifies vnode lock handling somewhat.
MFC after: 2 weeks
r197525, so that the creation verifier is handled correctly
in va_atime for 64bit architectures. There were two problems.
One was that the code incorrectly assumed that
sizeof (struct timespec) == 8 and the other was that the tv_sec
field needs to be assigned from a signed 32bit integer, so that
sign extension occurs on 64bit architectures. This is required
for correct operation when exporting ZFS volumes.
Reviewed by: pjd
MFC after: 2 weeks
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator. Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...). This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack. Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory. Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy. Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address. When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz
Reviewed by: bz, zec
Discussed with: gnn, jamie, jeff, jhb, julian, sam
Suggested by: peter
Approved by: re (kensmith)
NGROUPS_MAX, eliminate ABI dependencies on them, and raise the to 1024
and 1023 respectively. (Previously they were equal, but under a close
reading of POSIX, NGROUPS_MAX was defined to be too large by 1 since it
is the number of supplemental groups, not total number of groups.)
The bulk of the change consists of converting the struct ucred member
cr_groups from a static array to a pointer. Do the equivalent in
kinfo_proc.
Introduce new interfaces crcopysafe() and crsetgroups() for duplicating
a process credential before modifying it and for setting group lists
respectively. Both interfaces take care for the details of allocating
groups array. crsetgroups() takes care of truncating the group list
to the current maximum (NGROUPS) if necessary. In the future,
crsetgroups() may be responsible for insuring invariants such as sorting
the supplemental groups to allow groupmember() to be implemented as a
binary search.
Because we can not change struct xucred without breaking application
ABIs, we leave it alone and introduce a new XU_NGROUPS value which is
always 16 and is to be used or NGRPS as appropriate for things such as
NFS which need to use no more than 16 groups. When feasible, truncate
the group list rather than generating an error.
Minor changes:
- Reduce the number of hand rolled versions of groupmember().
- Do not assign to both cr_gid and cr_groups[0].
- Modify ipfw to cache ucreds instead of part of their contents since
they are immutable once referenced by more than one entity.
Submitted by: Isilon Systems (initial implementation)
X-MFC after: never
PR: bin/113398 kern/133867
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.
Discussed with: pjd
current NFSv4 ACLs, as defined in sys/acl.h. It still needs a
way to test a mount point for NFSv4 ACL support before it will
work. Until then, the NFSHASNFS4ACL() macro just always returns 0.
Approved by: kib (mentor)
that the range of versions of NFS handled by the server can
be limited. The nfsd daemon must be restarted after these
sysctl variables are changed, in order for the change to take
effect.
Approved by: kib (mentor)
that it handles the case of a non-exported NFSv4 root correctly.
Also, delete handling for the case where nd_repstat is already
set in nfs_proc(), since that no longer happens.
Approved by: kib (mentor)
ReleaseLockOwner operations analagous to what is already
in place for SetClientID and SetClientIDConfirm. These are
the five NFSv4 operations that do not use file handle(s),
so the checks are done using the NFSv4 root export entries
in /etc/exports.
Approved by: kib (mentor)
where a client is not allowed NFSv4 access correctly. This
restriction is specified in the "V4: ..." line(s) in
/etc/exports.
Approved by: kib (mentor)
when it runs out of clientids is reboot. I had replaced cpu_reboot()
with printf(), since cpu_reboot() doesn't exist for sparc64.
This change replaces the printf() with panic(), so the reboot
would occur for this highly unlikely occurrence.
Approved by: kib (mentor)
experimental nfsv4 server. It was setting the a_id argument
to a fixed value, but that wasn't sufficient for FreeBSD8.
Instead, set l_pid and l_sysid to 0 plus set the F_REMOTE
flag to indicate that these fields are used to check for
same lock owner. Since, for NFSv4, a lockowner is a ClientID plus
an up to 1024byte name, it can't be put in l_sysid easily.
I also renamed the p variable to td, since it's a thread ptr.
Approved by: kib (mentor)
nfsrv_dolocallocks can be changed via sysctl. I also added some non-empty
descriptor strings and reformatted some overly long lines.
Approved by: kib (mentor)
required two changes: setting the program and version numbers before
connect and fixing the handling of the Null Rpc case in newnfs_request().
Approved by: kib (mentor)
the VFS. Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.
In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.
While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.
VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled. Bump __FreeBSD_version in order to signal such
situation.
Credential might need to hang around longer than its parent and be used
outside of mnt_explock scope controlling netcred lifetime. Use separate
reference-counted ucred allocated separately instead.
While there, extend mnt_explock coverage in vfs_stdexpcheck and clean-up
some unused declarations in new NFS code.
Reported by: John Hickey
PR: kern/133439
Reviewed by: dfr, kib
support for NFSv4 as well as NFSv2 and 3.
It lives in 3 subdirs under sys/fs:
nfs - functions that are common to the client and server
nfsclient - a mutation of sys/nfsclient that call generic functions
to do RPCs and handle state. As such, it retains the
buffer cache handling characteristics and vnode semantics that
are found in sys/nfsclient, for the most part.
nfsserver - the server. It includes a DRC designed specifically for
NFSv4, that is used instead of the generic DRC in sys/rpc.
The build glue will be checked in later, so at this point, it
consists of 3 new subdirs that should not affect kernel building.
Approved by: kib (mentor)