Commit Graph

2812 Commits

Author SHA1 Message Date
Jaakko Heinonen
295a542d96 Apply changes from r234103 to ext2fs:
Return EPERM from ext2_setattr() when an user without PRIV_VFS_SYSFLAGS
privilege attempts to toggle SF_SETTABLE flags.

Flags are now stored to ip->i_flags in one place after all checks.

Also, remove SF_NOUNLINK from the checks because ext2fs doesn't support
that flag.

Reviewed by:	bde
2012-04-13 05:48:31 +00:00
Jaakko Heinonen
e6b8bdf252 Restore the blank line incorrectly removed in r234104.
Pointed out by:	bde
2012-04-11 15:48:50 +00:00
Jaakko Heinonen
034efc61ba Apply changes from r233787 to ext2fs:
- Use more natural ip->i_flags instead of vap->va_flags in the final
  flags check.
- Style improvements.

No functional change intended.

MFC after:	2 weeks
2012-04-10 16:05:52 +00:00
Attilio Rao
a0f2c37b6f - Introduce a cache-miss optimization for consistency with other
accesses of the cache member of vm_object objects.
- Use novel vm_page_is_cached() for checks outside of the vm subsystem.

Reviewed by:	alc
MFC after:	2 weeks
X-MFC:		r234039
2012-04-09 17:05:18 +00:00
Kirk McKusick
827e334c01 Add I/O accounting to msdos filesystem.
Suggested and reviewed by: kib
2012-04-08 06:18:18 +00:00
Gleb Kurtsou
9295c62814 tmpfs supports only INT_MAX nodes due to limitations of unit number
allocator.

Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in
memory is not likely to become possible in foreseeable feature and would
require new unit number allocator.

Discussed with: delphij
MFC after:	2 weeks
2012-04-07 15:30:46 +00:00
Gleb Kurtsou
0ff93c48da Add vfs_getopt_size. Support human readable file system options in tmpfs.
Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs.

Discussed with:	delphij
MFC after:	2 weeks
2012-04-07 15:27:34 +00:00
Gleb Kurtsou
da7aa2778e Add reserved memory limit sysctl to tmpfs.
Cleanup availble and used memory functions.
Check if free pages available before allocating new node.

Discussed with:	delphij
2012-04-07 15:23:51 +00:00
Konstantin Belousov
a53373fabe Add sysctl vfs.nfs.nfs_keep_dirty_on_error to switch the nfs client
behaviour on error from write RPC back to behaviour of old nfs client.
When set to not zero, the pages for which write failed are kept dirty.

PR:	kern/165927
Reviewed by:	alc
MFC after:	2 weeks
2012-03-17 23:03:20 +00:00
Gleb Kurtsou
db94ad126a Prevent tmpfs_rename() deadlock in a way similar to UFS
Unlock vnodes and try to lock them one by one. Relookup fvp and tvp.

Approved by:	mdf (mentor)
2012-03-14 09:15:50 +00:00
Gleb Kurtsou
ca846258e2 Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp()
Doomed vnode is hardly of any use here, besides all callers handle error
case. vfs_hash_get() does the same.

Don't mess with vnode holdcount, vget() takes care of it already.

Approved by:	mdf (mentor)
2012-03-14 08:29:21 +00:00
Kevin Lo
11753bd018 Use NULL instead of 0 2012-03-13 10:04:13 +00:00
Konstantin Belousov
0e738b4c0f Update comment.
Submitted by:	gianni
2012-03-11 15:58:27 +00:00
Konstantin Belousov
b80dcb55aa Remove fifo.h. The only used function declaration from the header is
migrated to sys/vnode.h.

Submitted by:	gianni
2012-03-11 12:19:58 +00:00
Pedro F. Giffuni
035e4e0494 Add support for ns timestamps and birthtime to the ext2/3 driver.
When using big inodes there is sufficient space in ext3 to
keep extra resolution and birthtime (creation) timestamps.
The appropriate fields in the on-disk inode have been approved
for a long time but support for this in ext3 has not been
widely  distributed.

In preparation for ext4 most linux distributions have enabled
by default such bigger inodes and some people use nanosecond
timestamps in ext3. We now support those when the inode is big
enough and while we do recognize the EXT4F_ROCOMPAT_EXTRA_ISIZE,
we maintain the extra timestamps even when they are not used.

An additional note by Bruce Evans:
We blindly accept unrepresentable tv_nsec in VOP_SETATTR(), but
all file  systems have always done that.  When POSIX gets around
to  specifying the behaviour, it will probably require certain
rounding to the fs's resolution and not rejecting the request.
This unfortunately means that syscalls that set times can't
really tell if they succeeded without reading back the times
using stat() or similar and checking that they were set close
enough.

Reviewed by:	bde
Approved by:	jhb (mentor)
MFC after:	2 weeks
2012-03-08 21:06:05 +00:00
John Baldwin
b47f624183 Add KTR_VFS traces to track modifications to a vnode's writecount. 2012-03-08 20:27:20 +00:00
Konstantin Belousov
f950879e16 The pipe_poll() performs lockless access to the vnode to test
fifo_iseof() condition, allowing the v_fifoinfo to be reset and freed
by fifo_cleanup().

Precalculate EOF at the places were fo_wgen is changed, and cache the
state in a new pipe state flag PIPE_SAMEWGEN.

Reported and tested by:	bf
Submitted by:	gianni
MFC after:	1 week (a backport)
2012-03-07 07:31:50 +00:00
Konstantin Belousov
31452ff75e Apply inlined vn_vget_ino() algorithm for ".." lookup in pseudofs.
Reported and tested by:	pho
MFC after:	2 weeks
2012-03-05 11:38:02 +00:00
Konstantin Belousov
ea4072446b Remove unneeded cast to u_int. The values as small enough to fit into
int, beside the use of MIN macro which performs type promotions.

Submitted by:	bde
MFC after:	3 weeks
2012-03-04 14:51:42 +00:00
Kevin Lo
c225ad032d Remove unnecessary casts 2012-03-04 09:48:58 +00:00
Kevin Lo
dd104b3305 Clean up style(9) nits 2012-03-04 09:38:20 +00:00
Rick Macklem
b76ec2db93 The name caching changes of r230394 exposed an intermittent bug
in the new NFS server for NFSv4, where it would report ENOENT
when the file actually existed on the server. This turned out
to be caused by not initializing ni_topdir before calling lookup()
and there was a rare case where the value on the stack location
assigned to ni_topdir happened to be a pointer to a ".." entry,
such that "dp == ndp->ni_topdir" succeeded in lookup().
This patch initializes ni_topdir to fix the problem.

MFC after:	5 days
2012-03-03 16:13:20 +00:00
Rick Macklem
5e99212d36 Post r230394, the Lookup RPC counts for both NFS clients increased
significantly. Upon investigation this was caused by name cache
misses for lookups of "..". For name cache entries for non-".."
directories, the cache entry serves double duty. It maps both the
named directory plus ".." for the parent of the directory. As such,
two ctime values (one for each of the directory and its parent) need
to be saved in the name cache entry.
This patch adds an entry for ctime of the parent directory to the
name cache. It also adds an additional uma zone for large entries
with this time value, in order to minimize memory wastage.
As well, it fixes a couple of cases where the mtime of the parent
directory was being saved instead of ctime for positive name cache
entries. With this patch, Lookup RPC counts return to values similar
to pre-r230394 kernels.

Reported by:	bde
Discussed with:	kib
Reviewed by:	jhb
MFC after:	2 weeks
2012-03-03 01:06:54 +00:00
John Baldwin
58d65e8031 Similar to the fixes in 226967 and 226987, purge any name cache entries
associated with the previous vnode (if any) associated with the target of
a rename().  Otherwise, a lookup of the target pathname concurrent with a
rename() could re-add a name cache entry after the namei(RENAME) lookup
in kern_renameat() had purged the target pathname.

MFC after:	2 weeks
2012-03-02 18:55:19 +00:00
Konstantin Belousov
66f02f4b25 Do not expose unlocked unconstructed nullfs vnode on mount list.
Lock the native nullfs vnode lock before switching the locks.

Tested by:	pho
MFC after:	1 week
2012-03-02 09:48:46 +00:00
Rick Macklem
4cf7d12840 Fix the NFS clients so that they use copyin() instead of bcopy(),
when doing direct I/O. This direct I/O code is not enabled by default.

Submitted by:	kib (earlier version)
Reviewed by:	kib
MFC after:	1 week
2012-03-01 03:53:07 +00:00
Martin Matuska
b362f0a78b Add "export" to devfs_opts[] and return EOPNOTSUPP if called with it.
Fixes mountd warnings.

Reported by:	kib
MFC after:	1 week
2012-02-29 16:16:36 +00:00
Konstantin Belousov
37a1046e61 Allow shared locks for reads when lower filesystem accept shared locking.
Tested by:	pho
MFC after:	1 week
2012-02-29 15:18:53 +00:00
Konstantin Belousov
cec1d07726 Document that null_nodeget() cannot take shared-locked lowervp due to
insmntque() requirements.

Tested by:	pho
MFC after:	1 week
2012-02-29 15:18:04 +00:00
Konstantin Belousov
409b12c08a In null_reclaim(), assert that reclaimed vnode is fully constructed,
instead of accepting half-constructed vnode. Previous code cannot decide
what to do with such vnode anyway, and although processing it for hash
removal, paniced later when getting rid of nullfs reference on lowervp.

While there, remove initializations from the declaration block.

Tested by:	pho
MFC after:	1 week
2012-02-29 15:15:36 +00:00
Konstantin Belousov
e4e1d9f382 Always request exclusive lock for the lower vnode in nullfs_vget().
The null_nodeget() requires exclusive lock on lowervp to be able to
insmntque() new vnode.

Reported by:	rea
Tested by:	pho
MFC after:	1 week
2012-02-29 15:09:20 +00:00
Konstantin Belousov
67e3d54f80 Move the code to destroy half-contructed nullfs vnode into helper
function null_destroy_proto() from null_insmntque_dtr(). Also
apply null_destroy_proto() in null_nodeget() when we raced and a vnode
is found in the hash, so the currently allocated protonode shall be
destroyed.

Lock the vnode interlock around reassigning the v_vnlock.

In fact, this path will not be exercised after several later commits,
since null_nodeget() cannot take shared-locked lowervp at all due to
insmntque() requirements.

Reported by:	rea
Tested by:	pho
MFC after:	1 week
2012-02-29 15:06:00 +00:00
Konstantin Belousov
da732fc69f Merge a split multi-line comment.
MFC after:	1 week
2012-02-29 14:43:27 +00:00
Martin Matuska
41c0675e6e Add procfs to jail-mountable filesystems.
Reviewed by:	jamie
MFC after:	1 week
2012-02-29 00:30:18 +00:00
Kevin Lo
19b029a487 Remove an unused structure and unnecessary cast 2012-02-24 07:30:44 +00:00
Kevin Lo
a61d3d5a99 Check if the user has necessary permissions on the device 2012-02-24 07:29:06 +00:00
Martin Matuska
bf3db8aa65 To improve control over the use of mount(8) inside a jail(8), introduce
a new jail parameter node with the following parameters:

allow.mount.devfs:
	allow mounting the devfs filesystem inside a jail

allow.mount.nullfs:
	allow mounting the nullfs filesystem inside a jail

Both parameters are disabled by default (equals the behavior before
devfs and nullfs in jails). Administrators have to explicitly allow
mounting devfs and nullfs for each jail. The value "-1" of the
devfs_ruleset parameter is removed in favor of the new allow setting.

Reviewed by:	jamie
Suggested by:	pjd
MFC after:	2 weeks
2012-02-23 18:51:24 +00:00
Kip Macy
11ac7ec076 merge pipe and fifo implementations
Also reviewed by: jhb, jilles (initial revision)
Tested by: pho, jilles

Submitted by:	gianni
Reviewed by:	bde
2012-02-23 18:37:30 +00:00
Rick Macklem
7cfce7cec7 hrs@ reported a panic to freebsd-stable@ under the subject line
"panic in 8.3-PRERELEASE" on Feb. 22, 2012. This panic was caused
by use of a mix of tsleep() and msleep() calls on the same event
in the new NFS server DRC code. It did "mtx_unlock(); tsleep();"
in two places, which kib@ noted introduced a slight risk that the
wakeup() would occur before the tsleep(), resulting in a 10sec
delay before waking up. This patch fixes the problem by replacing
"mtx_unlock(); tsleep();" with mtx_sleep(..PDROP..). It also
changes a nfsmsleep() call to mtx_sleep() so that the code uses
mtx_sleep() consistently within the file.

Tested by:	hrs (in progress)
Reviewed by:	jhb
MFC after:	5 days
2012-02-23 16:47:05 +00:00
Konstantin Belousov
2aacee7779 Use DOINGASYNC() to test for async allowance, to honor VFS syncing requests.
Noted by:	bde
MFC after:	1 week
2012-02-22 13:01:17 +00:00
Konstantin Belousov
526d0bd547 Fix found places where uio_resid is truncated to int.
Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the
sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from
the usermode.

Discussed with:	bde, das (previous versions)
MFC after:	1 month
2012-02-21 01:05:12 +00:00
Kevin Lo
3d74f18ba4 Remove an unnecessary cast. 2012-02-20 09:56:14 +00:00
Bjoern A. Zeeb
9dba179d5e IFC @231845
Sponsored by:	Cisco Systems, Inc.
2012-02-17 00:27:48 +00:00
Rick Macklem
13b2772f8e Delete a couple of out of date comments that are no longer true in
the new NFS client.

Requested by:	bde
MFC after:	1 week
2012-02-16 02:19:53 +00:00
Tijl Coosemans
0662ee9826 Replace PRIdMAX with "jd" in a printf call. Cast the corresponding value to
intmax_t instead of uintmax_t, because the original type is off_t.
2012-02-14 11:24:24 +00:00
Ed Schouten
8fac9b7b7d Merge si_name and __si_namebuf.
The si_name pointer always points to the __si_namebuf member inside the
same object. Remove it and rename __si_namebuf to si_name.
2012-02-10 12:40:50 +00:00
Martin Matuska
61f0e25abf Allow mounting nullfs(5) inside jails.
This is now possible thanks to r230129.

MFC after:	1 month
2012-02-09 10:39:01 +00:00
Martin Matuska
0cc207a6f5 Add support for mounting devfs inside jails.
A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for
mounting devfs inside jails. A value of -1 disables mounting devfs in
jails, a value of zero means no restrictions. Nested jails can only
have mounting devfs disabled or inherit parent's enforcement as jails are
not allowed to view or manipulate devfs(8) rules.

Utilizes new functions introduced in r231265.

Reviewed by:	jamie
MFC after:	1 month
2012-02-09 10:22:08 +00:00
Martin Matuska
17d84d611f Introduce the "ruleset=number" option for devfs(5) mounts.
Add support for updating the devfs mount (currently only changing the
ruleset number is supported).
Check mnt_optnew with vfs_filteropt(9).

This new option sets the specified ruleset number as the active ruleset
of the new devfs mount and applies all its rules at mount time. If the
specified ruleset doesn't exist, a new empty ruleset is created.

MFC after:	1 month
2012-02-09 10:09:12 +00:00
Pedro F. Giffuni
3cc6ae1f57 Update the data structures with some fields reserved for
ext4 but that can be used in ext3 mode.

Also adjust the internal inode to carry the birthtime,
like in UFS, which is starting to get some use when
big inodes are available.

Right now these are just placeholders for features
to come.

Approved by:	jhb (mentor)
MFC after:	2 weeks
2012-02-07 22:31:28 +00:00