to it for the client side reply. Hopefully this fixes the
problem with using the new krpc for arm for the experimental
nfs client.
Approved by: kib (mentor)
where a client is not allowed NFSv4 access correctly. This
restriction is specified in the "V4: ..." line(s) in
/etc/exports.
Approved by: kib (mentor)
case of a kerberized mount without a host based principal
name. This will only work for mounts being done by a user
other than root. Support for a host based principal name
will not work until proposed changes to the rpcsec_gss part
of the krpc are committed. It now builds for "options KGSSAPI".
Approved by: kib (mentor)
the code will build when "options KGSSAPI" is specified
without requiring the proposed changes that add host based
initiator principal support. It will not handle the case where
the client uses a host based initiator principal until those
changes are committed. The code that uses those changes is
#ifdef'd notyet until the krpc rpcsec_changes are committed.
Approved by: kib (mentor)
use the newer nmount() style arguments, as is used by mount_nfs.c.
This prepares the kernel code for the use of a mount_nfs.c with
changes for the experimental client integrated into it.
Approved by: kib (mentor)
writes upon close when a write delegation is held by the client.
This should be safe to do, now that nfsv4 Close operations are
delayed until ncl_inactive() is called for the vnode.
Approved by: kib (mentor)
when it runs out of clientids is reboot. I had replaced cpu_reboot()
with printf(), since cpu_reboot() doesn't exist for sparc64.
This change replaces the printf() with panic(), so the reboot
would occur for this highly unlikely occurrence.
Approved by: kib (mentor)
the NFSv4 Close operations until ncl_inactive(). This is
necessary so that the Open StateIDs are available for doing
I/O on mmap'd files after VOP_CLOSE(). I also changed some
indentation for the nfscl_getclose() function.
Approved by: kib (mentor)
experimental nfsv4 server. It was setting the a_id argument
to a fixed value, but that wasn't sufficient for FreeBSD8.
Instead, set l_pid and l_sysid to 0 plus set the F_REMOTE
flag to indicate that these fields are used to check for
same lock owner. Since, for NFSv4, a lockowner is a ClientID plus
an up to 1024byte name, it can't be put in l_sysid easily.
I also renamed the p variable to td, since it's a thread ptr.
Approved by: kib (mentor)
nfsrv_dolocallocks can be changed via sysctl. I also added some non-empty
descriptor strings and reformatted some overly long lines.
Approved by: kib (mentor)
required two changes: setting the program and version numbers before
connect and fixing the handling of the Null Rpc case in newnfs_request().
Approved by: kib (mentor)
experimental nfs subsystem from sys/fs/nfs/nfs.h and sys/fs/nfs/nfsproto.h
to sys/fs/nfs/nfsport.h and rename nfsstat to ext_nfsstat. This was done
so that src/usr.bin/nfsstat.c could use it alongside the regular nfs
include files and struct nfsstat.
Approved by: kib (mentor)
before the struct file is fully initialized in vn_open(), in particular,
fp->f_vnode is NULL. Other thread calling file operation before f_vnode
is set results in NULL pointer dereference in devvn_refthread().
Initialize f_vnode before calling d_fdopen() cdevsw method, that might
set file ops too.
Reported and tested by: Chris Timmons <cwt networks cwu edu>
(RELENG_7 version)
MFC after: 3 days
major/minor numbers of the devices.
Pretending that the vnodes are character devices prevents file tree
walkers from descending into the directories opened by current process.
Also, not doing stat on the filedescriptors prevents the recursive entry
into the VFS.
Requested by: kientzle
Discussed with: Jilles Tjoelker <jilles stack nl>
representing too large integer, instead of overflowing and possibly
returning a random but valid vnode.
Noted by: Jilles Tjoelker <jilles stack nl>
MFC after: 3 days
the VFS. Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.
In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.
While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.
VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled. Bump __FreeBSD_version in order to signal such
situation.
Credential might need to hang around longer than its parent and be used
outside of mnt_explock scope controlling netcred lifetime. Use separate
reference-counted ucred allocated separately instead.
While there, extend mnt_explock coverage in vfs_stdexpcheck and clean-up
some unused declarations in new NFS code.
Reported by: John Hickey
PR: kern/133439
Reviewed by: dfr, kib
support for NFSv4 as well as NFSv2 and 3.
It lives in 3 subdirs under sys/fs:
nfs - functions that are common to the client and server
nfsclient - a mutation of sys/nfsclient that call generic functions
to do RPCs and handle state. As such, it retains the
buffer cache handling characteristics and vnode semantics that
are found in sys/nfsclient, for the most part.
nfsserver - the server. It includes a DRC designed specifically for
NFSv4, that is used instead of the generic DRC in sys/rpc.
The build glue will be checked in later, so at this point, it
consists of 3 new subdirs that should not affect kernel building.
Approved by: kib (mentor)
the removal of NQNFS, but was left in in case it was required for NFSv4.
Since our new NFSv4 client and server can't use it for their
requirements, GC the old mechanism, as well as other unused lease-
related code and interfaces.
Due to its impact on kernel programming and binary interfaces, this
change should not be MFC'd.
Proposed by: jeff
Reviewed by: jeff
Discussed with: rmacklem, zach loafman @ isilon
filesystem supports additional operations using shared vnode locks.
Currently this is used to enable shared locks for open() and close() of
read-only file descriptors.
- When an ISOPEN namei() request is performed with LOCKSHARED, use a
shared vnode lock for the leaf vnode only if the mount point has the
extended shared flag set.
- Set LOCKSHARED in vn_open_cred() for requests that specify O_RDONLY but
not O_CREAT.
- Use a shared vnode lock around VOP_CLOSE() if the file was opened with
O_RDONLY and the mountpoint has the extended shared flag set.
- Adjust md(4) to upgrade the vnode lock on the vnode it gets back from
vn_open() since it now may only have a shared vnode lock.
- Don't enable shared vnode locks on FIFO vnodes in ZFS and UFS since
FIFO's require exclusive vnode locks for their open() and close()
routines. (My recent MPSAFE patches for UDF and cd9660 already included
this change.)
- Enable extended shared operations on UFS, cd9660, and UDF.
Submitted by: ups
Reviewed by: pjd (ZFS bits)
MFC after: 1 month
implementation instead. The bypass does not assume that returned vnode
is only held.
Reported by: Paul B. Mahol <onemda gmail com>, pluknet <pluknet gmail com>
Reviewed by: jhb
Tested by: pho, pluknet <pluknet gmail com>
poll_no_poll().
Return a poll_no_poll() result from devfs_poll_f() when
filedescriptor does not reference the live cdev, instead of ENXIO.
Noted and tested by: hps
MFC after: 1 week
We fail mapping for any udf_bmap_internal error and there can be
different reasons for it, so no need to (over-)emphasize files with
data in fentry.
Submitted by: bde
Approved by: jhb
Previosly readdir missed some directory entries because there was
no space for them in current uio but directory stream offset
was advanced nevertheless.
jhb has discoved the issue and provided a test-case.
Reviewed by: bde
Approved by: jhb (mentor)
Currently bread()-ing through device vnode with
(1) VMIO enabled,
(2) bo_bsize != DEV_BSIZE
(3) more than 1 block
results in data being incorrectly cached.
So instead a more common approach of using a vnode belonging to fs is now
employed.
Also, prevent attempt to bread more than MAXBSIZE bytes because of
adjustments made to account for offset that doesn't start on block
boundary.
Add expanded comments to explain the calculations.
Also drop unused inline function while here.
PR: kern/120967
PR: kern/129084
Reviewed by: scottl, kib
Approved by: jhb (mentor)
... as opposed to files with data in extents.
Some UDF authoring tools produce this type of file for sufficiently small
data files.
Approved by: jhb (mentor)
Use bit-shift instead of division/multiplication.
Act on error as soon as it is detected.
Report attempt to read data embedded in file entry via regular way.
While there, fix lblktosize macro and make use of it.
No functionality should change as a result.
Approved by: jhb (mentor)
msdosfs_unmount() and ffs_unmount() exit early after getting ENXIO.
However, dounmount() treats ENXIO as a success and proceeds with
unmounting. In effect, the filesystem gets unmounted without closing
GEOM provider etc.
Reviewed by: kib
Approved by: rwatson (mentor)
Tested by: dho
Sponsored by: FreeBSD Foundation
Not enough space in user-land buffer is not an error, userland
will read further until eof is reached. So instead of propagating
-1 to caller we convert it to zero/success.
cd9660 code works exactly the same way.
PR: kern/78987
Reviewed by: jhb (mentor)
Approved by: jhb (mentor)
- Always read the character device pointer while the associated devfs vnode
is locked. Also, use dev_ref() to obtain a new reference on the vnode for
the mountpoint. This reference is released on unmount. This mirrors the
earlier fix to FFS.
Reviewed by: kib
lookups:
- Honor the caller's locking flags in udf_root() and udf_vget().
- Set VV_ROOT for the root vnode in udf_vget() instead of only doing it in
udf_root().
- Honor the requested locking flags during pathname lookups in udf_lookup().
- Release the buffer holding the directory data before looking up the vnode
for a given file to avoid a LOR between the "udf" vnode locks and
"bufwait".
- Use vn_vget_ino() to handle ".." lookups.
- Special case "." lookups instead of calling udf_vget(). We have to do
extra checking for the vnode lock for "." lookups.
both node pointer and name component. This does the right thing for
hardlinks to the same node in the same directory.
Submitted by: Yoshihiro Ota <ota j email ne jp>
PR: kern/131356
MFC after: 2 weeks
sequence of pathname components. We walk the list building a string in
the caller's passed in buffer. Currently this only handles path names
in CS8 (character set 8) as that is what mkisofs generates for UDF images.
MFC after: 1 month
- Add a separate set of vnode operations that inherits from the fifo ops
and use it for fifo nodes.
- Add a VOP_SETATTR() method that allows setting the size (by silently
ignoring the requests) of fifos. This is to allow O_TRUNC opens of
fifo devices (e.g. I/O redirection in shells using ">").
- Add a VOP_PRINT() handler while I'm here.
- Align the fifo output in fifo_print() with other vn_printf() output.
- Remove the leading space from lockmgr_printinfo() so its output lines up
in vn_printf().
- lockmgr_printinfo() now ends with a newline, so remove an extra newline
from vn_printf().
So we are no longer interested in the error returned from
the *fs_doio() functions. With that we can remove the
error variable as its value is unused now.
Submitted by: Christoph Mallon christoph.mallon@gmx.de
After running a `make buildkernel', I noticed most of the Giant locks in
sysctl are only caused by a very small amount of sysctl's:
- sysctl.name2oid. This one is locked by SYSCTL_LOCK, just like
sysctl.oidfmt.
- kern.ident, kern.osrelease, kern.version, etc. These are just constant
strings.
- kern.arandom, used by the stack protector. It is already protected by
arc4_mtx.
I also saw the following sysctl's show up. Not as often as the ones
above, but still quite often:
- security.jail.jailed. Also mark security.jail.list as MPSAFE. They
don't need locking or already use allprison_lock.
- kern.devname, used by devname(3), ttyname(3), etc.
This seems to reduce Giant locking inside sysctl by ~75% in my primitive
test setup.
pathname lookups.
- Remove 'i_offset' and 'i_ino' from the ISO node structure and replace
them with local variables in the lookup routine instead.
- Cache a copy of 'i_diroff' for use during a lookup in a local variable.
- Save a copy of the found directory entry in a malloc'd buffer after a
successfull lookup before getting the vnode. This allows us to release
the buffer holding the directory block before calling vget() which
otherwise resulted in a LOR between "bufwait" and the vnode lock.
- Use an inlined version of vn_vget_ino() to handle races with ..
lookups. I had to inline the code here since cd9660 uses an internal
vget routine to save a disk I/O that would otherwise re-read the
directory block.
- Honor the requested locking flags during lookups to allow for shared
locking.
- Honor the requested locking flags passed to VFS_ROOT() and VFS_VGET()
similar to UFS.
- Don't make every ISO 9660 vnode hold a reference on the vnode of the
underlying device vnode of the mountpoint. The mountpoint already
holds a suitable reference.
Inside the kernel, the minor() function was responsible for obtaining
the device minor number of a character device. Because we made device
numbers dynamically allocated and independent of the unit number passed
to make_dev() a long time ago, it was actually a misnomer. If you really
want to obtain the device number, you should use dev2udev().
We already converted all the drivers to use dev2unit() to obtain the
device unit number, which is still used by a lot of drivers. I've
noticed not a single driver passes NULL to dev2unit(). Even if they
would, its behaviour would make little sense. This is why I've removed
the NULL check.
Ths commit removes minor(), minor2unit() and unit2minor() from the
kernel. Because there was a naming collision with uminor(), we can
rename umajor() and uminor() back to major() and minor(). This means
that the makedev(3) manual page also applies to kernel space code now.
I suspect umajor() and uminor() isn't used that often in external code,
but to make it easier for other parties to port their code, I've
increased __FreeBSD_version to 800062.
without corresponding number of fifo_open(). This causes assertion
failure in fifo_close() due to vp->v_fifoinfo being NULL for kernel
with INVARIANTS, or NULL pointer dereference otherwise. In fact, we may
ignore excess calls to fifo_close() without bad consequences.
Turn KASSERT() into the return, and print warning for now.
Tested by: pho
Reviewed by: rwatson
MFC after: 2 weeks
to be caused by a metadata corruption that occurs quite often after
unplugging a pendrive during write activity.
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
pfs_vncache_free() that removes pvd from the list, while it is not yet
put on the list.
Prevent the invalid removal from the list by clearing pvd_next and
pvd_prev for the newly allocated pvd, and only move pfs_vncache list
head when the pvd was at the head.
Suggested and approved by: des
MFC after: 2 weeks
compare map->timestamp with saved timestamp after map read lock is
reacquired, not with saved timestamp + 1. The only consequence of the +1
was unconditional lookup of the next map entry, though.
Tested by: pho
Approved by: des
MFC after: 2 weeks
may need to lock arbitrary vnodes, causing either lock order reversal or
recursive vnode lock acquisition.
Tested by: pho
Approved by: des
MFC after: 2 weeks
do pfs_vncache_alloc() for the same pfs_node and pid. In this case, we
could end up with two vnodes for the pair. Recheck the cache under the
locked pfs_vncache_mutex after all sleeping operations are done [1].
This case mostly cannot happen now because pseudofs uses exclusive vnode
locking for lookup. But it does drop the vnode lock for dotdot lookups,
and Marcus' pseudofs_vptocnp implementation is vulnerable too.
Do not call free() on the struct pfs_vdata after insmntque() failure,
because vp->v_data points to the structure, and pseudofs_reclaim()
frees it by the call to pfs_vncache_free().
Tested by: pho [1]
Approved by: des
MFC after: 2 weeks
anything other than 0. Make it so. This fixes
"panic: VOP_STRATEGY failed bp=0xc320dd90 vp=0xc3b9f648",
encountered when writing to an orphaned filesystem. Reason
for the panic was the following assert:
KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp));
at vfs_bio:bufstrategy().
Reviewed by: scottl, phk
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation