Commit Graph

2916 Commits

Author SHA1 Message Date
jhb
ecb4042c11 Use the VA_UTIMES_NULL flag to detect when NULL was passed to utimes()
instead of comparing the desired time against the current time as a
heuristic.

Reviewed by:	rmacklem
MFC after:	1 week
2013-01-16 21:52:31 +00:00
kib
82d5c5773a Remove the filtering of the acceptable mount options for nullfs, added
in r245004.  Although the report was for noatime option which is
non-functional for the nullfs, other standard options like nosuid or
noexec are useful with it.

Reported by:	Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au>
MFC after:	3 days
2013-01-16 05:32:49 +00:00
jhb
e7637960eb - More properly handle interrupted NFS requests on an interruptible mount
by returning an error of EINTR rather than EACCES.
- While here, bring back some (but not all) of the NFS RPC statistics lost
  when krpc was committed.

Reviewed by:	rmacklem
MFC after:	1 week
2013-01-15 22:08:17 +00:00
kib
169c0a0a9f The current default size of the nullfs hash table used to lookup the
existing nullfs vnode by the lower vnode is only 16 slots.  Since the
default mode for the nullfs is to cache the vnodes, hash has extremely
huge chains.

Size the nullfs hashtbl based on the current value of
desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a
given vnode.

Pointy hat to:	    kib
Diagnosed and reviewed by:	peter
Tested by:    peter, pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
2013-01-14 05:44:47 +00:00
kib
b94d892898 When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed,
get back the leased write reference from the lower vnode.  There is no
other path which can correct v_writecount on the lowervp.

Reported by:	flo
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2013-01-10 18:24:48 +00:00
bapt
afe1d4e213 Add support for IO_APPEND flag in fuse
This make open(..., O_APPEND) actually works on fuse filesystem.

Reviewed by:	attilio
2013-01-08 12:21:50 +00:00
pfg
947a420026 ext2fs: cleanup de dinode structure.
It was plagued with style errors and the offsets had been lost.
While here took the time to update the fields according to the
latest ext4 documentation.

Reviewed by:	bde
MFC after:	3 days
2013-01-07 03:36:32 +00:00
gleb
97e76936ec tmpfs: Replace directory entry linked list with RB-Tree.
Use file name hash as a tree key, handle duplicate keys.  Both VOP_LOOKUP
and VOP_READDIR operations utilize same tree for search.  Directory
entry offset (cookie) is either file name hash or incremental id in case
of hash collisions (duplicate-cookies).  Keep sorted per directory list
of duplicate-cookie entries to facilitate cookie number allocation.

Don't fail if previous VOP_READDIR() offset is no longer valid, start
with next dirent instead.  Other file system handle it similarly.

Workaround race prone tn_readdir_last[pn] fields update.

Add tmpfs_dir_destroy() to free all dirents.

Set NFS cookies in tmpfs_dir_getdents(). Return EJUSTRETURN from
tmpfs_dir_getdents() instead of hard coded -1.

Mark directory traversal routines static as they are no longer
used outside of tmpfs_subr.c
2013-01-06 22:15:44 +00:00
kib
f74da69096 Fix reversed condition in the assertion.
Pointy hat to:	kib
MFC after:	13 days
2013-01-04 07:52:47 +00:00
kib
a7c71037df Add the "nocache" nullfs mount option, which disables the caching of
the free nullfs vnodes, switching nullfs behaviour to pre-r240285.
The option is mostly intended as the last-resort when higher pressure
on the vnode cache due to doubling of the vnode counts is not
desirable.

Note that disabling the cache costs more than 2x wall time in the
metadata-hungry scenarious.  The default is "cache".

Tested and benchmarked by:	pho (previous version)
MFC after:	2 weeks
2013-01-03 19:17:57 +00:00
kib
defbe57abe Remove the last use of the deprecated MNT_VNODE_FOREACH interface in
the tree.

With the help from:	mjg
Tested by:	Ronald Klop <ronald-freebsd8@klop.yi.org>
MFC after:	2 weeks
2013-01-03 19:01:56 +00:00
kib
c6bad3bef7 Do not force a writer to the devfs file to drain the buffer writes.
Requested and tested by:	Ian Lepore <freebsd@damnhippie.dyndns.org>
MFC after:	2 weeks
2012-12-23 22:43:27 +00:00
pfg
16216f308e More constant renaming in preparation for newer features.
We also try to make better use of the fs flags instead of
trying adapt the code according to the fs structures. In
the case of subsecond timestamps and birthtime we now
check that the feature is explicitly enabled: previously
we only checked that the reserved space was available and
silently wrote them.

This approach is much safer, especially if the filesystem
happens to use embedded inodes or support EAs.

Discussed with:	Zheng Liu
MFC after:	5 days
2012-12-20 02:22:36 +00:00
rmacklem
1c6e1dc79b Add "nfsstat -m" support for the two new NFS mount options
added by r244042.
2012-12-09 22:23:50 +00:00
rmacklem
c82d89183d Move the NFSv4.1 client patches over from projects/nfsv4.1-client
to head. I don't think the NFS client behaviour will change unless
the new "minorversion=1" mount option is used. It includes basic
NFSv4.1 support plus support for pNFS using the Files Layout only.
All problems detecting during an NFSv4.1 Bakeathon testing event
in June 2012 have been resolved in this code and it has been tested
against the NFSv4.1 server available to me.
Although not reviewed, I believe that kib@ has looked at it.
2012-12-08 22:52:39 +00:00
glebius
8e20fa5ae9 Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
rmacklem
d79bf0f49f Add an nfssvc() option to the kernel for the new NFS client
which dumps out the actual options being used by an NFS mount.
This will be used to implement a "-m" option for nfsstat(1).

Reviewed by:	alfred
MFC after:	2 weeks
2012-12-02 01:16:04 +00:00
pfg
b4fee55cdf Update some definitions or make them match NetBSD's headers.
Bring several definitions required for newer ext4 features.

Rename EXT2F_COMPAT_HTREE to EXT2F_COMPAT_DIRHASHINDEX since it
is not being used yet and the new name is more compatible with
NetBSD and Linux.

This change is purely cosmetic and has no effect on the real
code.

Obtained from:	NetBSD
MFC after:	3 days
2012-11-28 15:48:32 +00:00
pfg
0077f34174 Partially bring r242520 to ext2fs.
When a file is first being written, the dynamic block reallocation
(implemented by ext2_reallocblks) relocates the file's blocks
so as to cluster them together into a contiguous set of blocks on
the disk.

When the cluster crosses the boundary into the first indirect block,
the first indirect block is initially allocated in a position
immediately following the last direct block.  Block reallocation
would usually destroy locality by moving the indirect block out of
the way to keep the data blocks contiguous.

The issue was diagnosed long ago by Bruce Evans on ffs and surfaced
on ext2fs when block reallocaton was ported. This is only a partial
solution based on the similarities with FFS. We still require more
review of the allocation details that vary in ext2fs.

Reported by:	bde
MFC after:	1 week
2012-11-28 00:36:40 +00:00
davide
f3a37c7422 - smbfs_rename() might return an error value without correctly upgrading
the vnode use count, and this might cause the kernel to panic if compiled
with WITNESS enable.
- Be sure to put the '\0' terminator to the rpath string.

Sponsored by:	iXsystems inc.
2012-11-26 04:29:47 +00:00
davide
e52463677a - Remove reset of vpp pointer in some places as long as it's not really
useful and has the side effect of obfuscating the code a bit.
- Remove spurious references to simple_lock.

Reported by:	attilio [1]
Sponsored by:	iXsystems inc.
2012-11-22 09:13:45 +00:00
davide
017de7d030 Until now, smbfs_fullpath() computed the full path starting from the
vnode and following back the chain of n_parent pointers up to the root,
without acquiring the locks of the n_parent vnodes analyzed during the
computation. This is immediately wrong because if the vnode lock is not
held there's no guarantee on the validity of the vnode pointer or the data.
In order to fix, store the whole path in the smbnode structure so that
smbfs_fullpath() can use this information.

Discussed with:		kib
Reported and tested by:		pho
Sponsored by:		iXsystems inc.
2012-11-22 08:58:29 +00:00
kib
f31aa350da Remove the check and panic for an impossible condition. The NULL
lowervp vnode v_vnlock would cause panic due to NULL pointer
dereference much earlier.

MFC after:	1 week
2012-11-20 15:25:00 +00:00
attilio
b9a061c88d r16312 is not any longer real since many years (likely since when VFS
received granular locking) but the comment present in UFS has been
copied all over other filesystems code incorrectly for several times.

Removes comments that makes no sense now.

Reviewed by:	kib
MFC after:	3 days
2012-11-19 22:43:45 +00:00
kib
801de09716 In pget(9), if PGET_NOTWEXIT flag is not specified, also search the
zombie list for the pid. This allows several kern.proc sysctls to
report useful information for zombies.

Hold the allproc_lock around all searches instead of relocking it.
Remove private pfind_locked() from the new nfs client code.

Requested and reviewed by:	pjd
Tested by:	pho
MFC after:	3 weeks
2012-11-16 08:25:06 +00:00
kib
32941467f0 Remove M_USE_RESERVE from the devfs cdp allocator, which is one of two
uses of M_USE_RESERVE in the kernel. This allocation is not special.

Reviewed by:	alc
Tested by:	pho
MFC after:	2 weeks
2012-11-14 19:50:21 +00:00
davide
b4e3eb0677 Get rid of some old debug code. It provides checks similar to the one
offered by RedZone so there's no need to keep it.

Sponsored by:	iXsystems inc.
2012-11-14 19:10:50 +00:00
davide
d7bac61929 Fix the lookup in the DOTDOT case in the same way as other filesystems do,
i.e. inlining the vn_vget_ino() algorithm.

Sponsored by:	iXsystems inc.
2012-11-14 18:43:58 +00:00
attilio
0a289d546b - Protect mnt_data and mnt_flags under the mount interlock
- Move mp->mnt_stat manipulation where all of them happens

Reported by:	davide
Discussed with:	kib
Tested by:	flo
MFC after:	2 months
X-MFC:		241519, 242536,242616, 242727
2012-11-10 19:32:16 +00:00
attilio
d5d551ec46 Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag.
Porters should refer to __FreeBSD_version 1000021 for this change as
it may have happened at the same timeframe.
2012-11-09 18:02:25 +00:00
attilio
1e93fc5eeb - Current caching mode is completely broken because it simply relies
on timing of the operations and not real lookup, bringing too many
  false positives. Remove the whole mechanism. If it needs to be
  implemented, next time it should really be done in the proper way.
- Fix VOP_GETATTR() in order to cope with userland bugs that would
  change the type of file and not panic. Instead it gets the entry as
  if it is not existing.

Reported and tested by:	flo
MFC after:	2 months
X-MFC:		241519, 242536,242616
2012-11-08 00:32:49 +00:00
attilio
908519dd89 fuse_io* must be able to crunch also VDIR vnodes.
Update assert appropriately.

Reported and Tested by:	flo
MFC after:	2 months
X-MFC:		241519,242536
2012-11-05 15:23:54 +00:00
attilio
35cb5e8a85 Fix a bug where operations was carried on even if not implemented,
leading to handling of an invalid fdip object.

Reported and tested by:	flo
MFC after:	2 months
X-MFC:		241519
2012-11-03 23:32:32 +00:00
kib
f16ea99007 The r241025 fixed the case when a binary, executed from nullfs mount,
was still possible to open for write from the lower filesystem.  There
is a symmetric situation where the binary could already has file
descriptors opened for write, but it can be executed from the nullfs
overlay.

Handle the issue by passing one v_writecount reference to the lower
vnode if nullfs vnode has non-zero v_writecount.  Note that only one
write reference can be donated, since nullfs only keeps one use
reference on the lower vnode.  Always use the lower vnode v_writecount
for the checks.

Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is
currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT
to manipulate the v_writecount value, which manages a single bypass
reference to the lower vnode.  Caling the VOPs instead of directly
accessing v_writecount provide the fix described in the previous
paragraph.

Tested by:	pho
MFC after:	3 weeks
2012-11-02 13:56:36 +00:00
davide
ff8afcf2e9 - Do not put in the mntqueue half-constructed vnodes.
- Change the code so that it relies on vfs_hash rather than on a
  home-made hashtable.
- There's no need to inline fnv_32_buf().

Reviewed by:	delphij
Tested by:	pho
Sponsored by:	iXsystems inc.
2012-10-31 03:55:33 +00:00
davide
793cdde76e Fix panic due to page faults while in kernel mode, under conditions of
VM pressure. The reason is that in some codepaths pointers to stack
variables were passed from one thread to another.

In collaboration with:	pho
Reported by:	pho's stress2 suite
Sponsored by:	iXsystems inc.
2012-10-31 03:34:07 +00:00
davide
a7cdc19e4b Change the code to use %jd as printf() placeholder for uio_offset and
cast to intmax_t.

Suggested by:	pjd
Sponsored by:	iXsystems inc.
2012-10-31 02:54:44 +00:00
davide
1206789da9 Fix build in case we have SMBVDEBUG turned on.
Reviewed by:	gnn
Approved by:	gnn
Sponsored by:	iXsystems inc.
2012-10-25 21:08:02 +00:00
davide
bac20b679d - Remove the references to the deprecated zalloc kernel interface
- Use M_ZERO flag in malloc() rather than bzero()
- malloc() with M_NOWAIT can't return NULL so there's no need to check

Reviewed by:	alc
Approved by:	alc
2012-10-25 20:23:04 +00:00
kib
560aa751e0 Remove the support for using non-mpsafe filesystem modules.
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.

The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.

Conducted and reviewed by:	attilio
Tested by:	pho
2012-10-22 17:50:54 +00:00
eadler
3f7a414911 remove duplicate semicolons where possible.
Approved by:	cperciva
MFC after:	1 week
2012-10-22 03:00:37 +00:00
ed
1fb45e9f39 Remove unneeded D_NEEDMINOR.
This is only needed when using clonelists. This got remove in r238693.
2012-10-18 19:28:31 +00:00
rmacklem
813fc27188 Add two new options to the nfssvc(2) syscall that allow
processes running as root to suspend/resume execution
of the kernel nfsd threads. An earlier version of this
patch was tested by Vincent Hoffman (vince at unsane.co.uk)
and John Hickey (jh at deterlab.net).

Reviewed by:	kib
MFC after:	2 weeks
2012-10-14 22:33:17 +00:00
kib
f327177d6f Grammar fixes.
Submitted by:	bf
MFC after:	1 week
2012-10-14 18:13:33 +00:00
kib
7cc759236b Replace the XXX comment with the proper description.
MFC after:	1 week
2012-10-14 17:07:34 +00:00
attilio
95ca52bfbf Rename s/DEBUG()/FS_DEBUG() and s/DEBUG2G()/FS_DEBUG2G() in order to
avoid a name clash in sparc64.

MFC after:	2 months
X-MFC:		r241519
2012-10-14 03:51:59 +00:00
attilio
af2d834e29 Import a FreeBSD port of the FUSE Linux module.
This has been developed during 2 summer of code mandates and being revived
by gnn recently.
The functionality in this commit mirrors entirely content of fusefs-kmod
port, which doesn't need to be installed anymore for -CURRENT setups.

In order to get some sparse technical notes, please refer to:
http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html

or to the project branch:
svn://svn.freebsd.org/base/projects/fuse/

which also contains granular history of changes happened during port
refinements. This commit does not came from the branch reintegration
itself because it seems svn is not behaving properly for this functionaly
at the moment.

Partly Sponsored by:		Google, Summer of Code program 2005, 2011
Originally submitted by:	ilya, Csaba Henk <csaba-ml AT creo DOT hu >
In collabouration with:		pho
Tested by:			flo, gnn, Gustau Perez,
				Kevin Oberman <rkoberman AT gmail DOT com>
MFC after:			2 months
2012-10-13 23:54:26 +00:00
kib
8f845e475e Fix the mis-handling of the VV_TEXT on the nullfs vnodes.
If you have a binary on a filesystem which is also mounted over by
nullfs, you could execute the binary from the lower filesystem, or
from the nullfs mount. When executed from lower filesystem, the lower
vnode gets VV_TEXT flag set, and the file cannot be modified while the
binary is active. But, if executed as the nullfs alias, only the
nullfs vnode gets VV_TEXT set, and you still can open the lower vnode
for write.

Add a set of VOPs for the VV_TEXT query, set and clear operations,
which are correctly bypassed to lower vnode.

Tested by:	pho (previous version)
MFC after:	2 weeks
2012-09-28 11:25:02 +00:00
mdf
394f27b845 Fix up kernel sources to be ready for a 64-bit ino_t.
Original code by:	Gleb Kurtsou
2012-09-27 23:30:49 +00:00
rmacklem
c071417ab7 Modify the NFSv4 client so that it can handle owner
and owner_group strings that consist entirely of
digits, interpreting them as the uid/gid number.
This change was needed since new (>= 3.3) Linux
servers reply with these strings by default.
This change is mandated by the rfc3530bis draft.
Reported on freebsd-stable@ under the Subject
heading "Problem with Linux >= 3.3 as NFSv4 server"
by Norbert Aschendorff on Aug. 20, 2012.

Tested by:	norbert.aschendorff at yahoo.de
Reviewed by:	jhb
MFC after:	2 weeks
2012-09-20 02:49:25 +00:00