Commit Graph

2950 Commits

Author SHA1 Message Date
Baptiste Daroussin
3c4223a65a Revert r246791 as it needs a security review first
Reported by:	gavin, rwatson
2013-02-14 15:17:53 +00:00
Baptiste Daroussin
f4365abd91 Allow fdescfs to be mounted from inside a jail
MFC after:	1 week
2013-02-14 13:03:15 +00:00
Pedro F. Giffuni
a9d1b29995 ext2fs: Use prototype declarations for function definitions
Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-10 19:49:37 +00:00
Attilio Rao
5e60cb948e Remove a racy checks on resident and cached pages for
tmpfs_mapped{read, write}() functions:
- tmpfs_mapped{read, write}() are only called within VOP_{READ, WRITE}(),
  which check before-hand to work only on valid VREG vnodes.  Also the
  vnode is locked for the duration of the work, making vnode reclaiming
  impossible, during the operation. Hence, vobj can never be NULL.
- Currently check on resident pages and cached pages without vm object
  lock held is racy and can do even more harm than good, as a page could
  be transitioning between these 2 pools and then be skipped entirely.
  Skip the checks as lookups on empty splay trees are very cheap.

Discussed with:	alc
Tested by:	flo
MFC after:	2 weeks
2013-02-10 01:04:10 +00:00
Pedro F. Giffuni
4e1e0e2582 ext2fs: Replace redundant EXT2_MIN_BLOCK with EXT2_MIN_BLOCK_SIZE.
Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-08 21:09:44 +00:00
Pedro F. Giffuni
1a125d6d85 ext2fs: make e2fs_maxcontig local and remove tautological check.
e2fs_maxcontig was modelled after UFS when bringing the
"Orlov allocator" to ext2. On UFS fs_maxcontig is kept in the
superblock and is used by userland tools (fsck and growfs),

In ext2 this information is volatile so it is not available
for userland tools, so in this case it doesn't have sense
to carry it in the in-memory superblock.

Also remove a pointless check for MAX(1, x) > 0.

Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-08 20:58:00 +00:00
Pedro F. Giffuni
a940ce65cd Remove unused MAXSYMLINKLEN macro.
Reviewed by:	mckusick
PR:		kern/175794
MFC after:	1 week
2013-02-08 20:30:19 +00:00
Konstantin Belousov
2ca4998342 Stop translating the ERESTART error from the open(2) into EINTR.
Posix requires that open(2) is restartable for SA_RESTART.

For non-posix objects, in particular, devfs nodes, still disable
automatic restart of the opens. The open call to a driver could have
significant side effects for the hardware.

Noted and reviewed by:	jilles
Discussed with:	bde
MFC after:	2 weeks
2013-02-07 14:53:33 +00:00
John Baldwin
a120a7a3cd Rework the handling of stop signals in the NFS client. The changes in
195702, 195703, and 195821 prevented a thread from suspending while holding
locks inside of NFS by forcing the thread to fail sleeps with EINTR or
ERESTART but defer the thread suspension to the user boundary.  However,
this had the effect that stopping a process during an NFS request could
abort the request and trigger EINTR errors that were visible to userland
processes (previously the thread would have suspended and completed the
request once it was resumed).

This change instead effectively masks stop signals while in the NFS client.
It uses the existing TDF_SBDRY flag to effect this since SIGSTOP cannot
be masked directly.  Also, instead of setting PBDRY on individual sleeps,
the NFS client now sets the TDF_SBDRY flag around each NFS request and
stop signals are masked for all sleeps during that region (the previous
change missed sleeps in lockmgr locks).  The end result is that stop
signals sent to threads performing an NFS request are completely
ignored until after the NFS request has finished processing and the
thread prepares to return to userland.  This restores the behavior of
stop signals being transparent to userland processes while still
preventing threads from suspending while holding NFS locks.

Reviewed by:	kib
MFC after:	1 month
2013-02-06 17:06:51 +00:00
Pedro F. Giffuni
d427334435 ext2fs: move assignment where it is not dead.
Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-05 03:26:34 +00:00
Pedro F. Giffuni
80b6a61199 ext2fs: Remove unused em_e2fsb definition..
Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-05 03:23:56 +00:00
Pedro F. Giffuni
555368dcf1 ext2fs: Remove useless rootino local variable.
Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-05 03:17:41 +00:00
Pedro F. Giffuni
ef024b0da9 ext2fs: Correct off-by-one errors in FFTODT() and DDTOFT().
Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-05 03:13:05 +00:00
Pedro F. Giffuni
666116e4a3 ext2fs: Use nitems().
Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-05 03:08:56 +00:00
Pedro F. Giffuni
fdc100e4c2 ext2fs: Use EXT2_LINK_MAX instead of LINK_MAX
Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-05 03:01:04 +00:00
Pedro F. Giffuni
757224cbdb ext2fs: general cleanup.
- Remove unused extern declarations in fs.h
- Correct comments in ext2_dir.h
- Several panic() messages showed wrong function names.
- Remove commented out stray line in ext2_alloc.c.
- Remove the unused macro EXT2_BLOCK_SIZE_BITS() and the then
  write-only member e2fs_blocksize_bits from struct m_ext2fs.
- Remove the unused macro EXT2_FIRST_INO() and the then write-only
  member e2fs_first_inode from struct m_ext2fs.
- Remove EXT2_DESC_PER_BLOCK() and the member e2fs_descpb from
  struct m_ext2fs.
- Remove the unused members e2fs_bmask, e2fs_dbpg and
  e2fs_mount_opt from struct m_ext2fs
- Correct harmless off-by-one error for fspath in ext2_vfsops.c.
- Remove the unused and broken macros EXT2_ADDR_PER_BLOCK_BITS()
  and EXT2_DESC_PER_BLOCK_BITS().
- Remove the !_KERNEL versions of the EXT2_* macros.

Submitted by:	Christoph Mallon
MFC after:	2 weeks
2013-02-02 22:23:45 +00:00
Konstantin Belousov
11fca81ccd The MSDOSFSMNT_WAITONFAT flag is bogus and broken. It does less than
track the MNT_SYNCHRONOUS flag.  It is set to the latter at mount time
but not updated by MNT_UPDATE.

Use MNT_SYNCHRONOUS to decide to write the FAT updates syncrhonously.

Submitted by:	bde
MFC after:	1 week
2013-02-01 18:30:41 +00:00
Konstantin Belousov
79fb7dd167 Backup FATs were sometimes marked dirty by copying their first block
from the primary FAT, and then they were not marked clean on unmount.
Force marking them clean when appropriate.

Submitted by:	bde
MFC after:	1 week
2013-02-01 18:25:53 +00:00
Konstantin Belousov
9ec062dddd The directory entry for dotdot was corrupted in the FAT32 case when moving
a directory to a subdir of the root directory from somewhere else.

For all directory moves that change the parent directory, the dotdot
entry must be fixed up.  For msdosfs, the root directory is magic for
non-FAT32.  It is less magic for FAT32, but needs the same magic for
the dotdot fixup.  It didn't have it.

Both chkdsk and fsck_msdosfs fix the corrupt directory entries with no
problems.

The fix is to use the same magic for dotdot in msdosfs_rename() as in
msdosfs_mkdir().

For msdosfs_mkdir(), document the magic. When writing the dotdot entry
in mkdir, use explicitly set pcl variable instead on relying on the
start cluster of the root directory typically has a value < 65536.

Submitted by:	bde
MFC after:	1 week
2013-02-01 18:06:06 +00:00
Konstantin Belousov
48efa33b49 The mountmsdosfs() function had an insane sanity test, remove it.
Trying FAT32 on a small partition failed to mount because
pmp->pm_Sectors was nonzero.  Normally, FAT32 file systems are so
large that the 16-bit pm_Sectors can't hold the size.  This is
indicated by setting it to 0 and using only pm_HugeSectors.  But at
least old versions of newfs_msdos use the 16-bit field if possible,
and msdosfs supports this except for breaking its own support in the
sanity check.  This is quite different from the handling of pm_FATsecs
-- now the 16-bit value is always ignored for FAT32 except for
checking that it is 0, and newfs_msdos doesn't use the 16-bit value
for FAT32.

Submitted by:	bde
MFC after:	1 week
2013-02-01 18:01:03 +00:00
Konstantin Belousov
a26b949f2d Fix a backwards comment in markvoldirty().
Submitted by:	bde
MFC after:	1 week
2013-02-01 17:58:37 +00:00
Konstantin Belousov
dd6035234a Assert that the mbuf in the chain has sane length. Proper place for
this check is somewhere in the network code, but this assertion
already proven to be useful in catching what seems to be driver bugs
causing NFS scrambling random memory.

Discussed with:	rmacklem
MFC after:	1 week
2013-02-01 16:57:02 +00:00
Konstantin Belousov
6168020f66 Be conservative and do not try to consume more bytes than was
requested from the server for the read operation.  Server shall not
reply with too large size, but client should be resilent too.

Reviewed by:	rmacklem
MFC after:	1 week
2013-01-27 09:34:25 +00:00
Pedro F. Giffuni
646a7fea0c Clean some 'svn:executable' properties in the tree.
Submitted by:	Christoph Mallon
MFC after:	3 days
2013-01-26 22:08:21 +00:00
Pedro F. Giffuni
879aeda7b6 Cosmetical off-by-one
Technically, the case when all the blocks are released
is not a sanity check.
Move further the comment while here.

Suggested by:	bde
MFC after:	3 days
2013-01-26 21:50:52 +00:00
John Baldwin
a89a2c8ba4 Further cleanups to use of timestamps in NFS:
- Use NFSD_MONOSEC (which maps to time_uptime) instead of the seconds
  portion of wall-time stamps to manage timeouts on events.
- Remove unused nd_starttime from the per-request structure in the new
  NFS server.
- Use nanotime() for the modification time on a delegation to get as
  precise a time as possible.
- Use time_second instead of extracting the second from a call to
  getmicrotime().

Submitted by:	bde (3)
Reviewed by:	bde, rmacklem
MFC after:	2 weeks
2013-01-25 15:25:24 +00:00
Pedro F. Giffuni
69017f8d8c ext2fs: fix a check for negative block numbers.
The previous change accidentally left the substraction we
were trying to avoid in case that i_blocks could become
negative.

Reported by:	bde
MFC after:	4 days
2013-01-23 14:29:29 +00:00
Pedro F. Giffuni
1d04725a7a ext2fs: make some inode fields match the ext2 spec.
Ext2fs uses unsigned fields in its dinode struct.
FreeBSD can have negative values in some of those
fields and the inode is meant to interact with the
system so we have never respected the unsigned
nature of most of those fields.

Block numbers and the NFS generation number do
not need to be signed so redefine them as
unsigned to better match the on-disk information.

MFC after:	1 week
2013-01-22 18:54:03 +00:00
Pedro F. Giffuni
4b21c8fda9 ext2fs: temporarily disable the reallocation code.
Testing with fsx has revealed problems and in order to
hunt the bugs properly we need reduce the complexity.

This seems to help but is not a complete solution.

MFC after:	3 days
2013-01-22 18:36:31 +00:00
Xin LI
e4558aacfc Make it possible to force async at server side on new NFS server, similar
to the old one's nfs.nfsrv.async.

Please note that by enabling this option (default is disabled), the system
could potentionally have silent data corruption if the server crashes
before write is committed to non-volatile storage, as the client side have
no way to tell if the data is already written.

Submitted by:	rmacklem
MFC after:	2 weeks
2013-01-18 19:42:08 +00:00
Pedro F. Giffuni
b187108f66 ext2fs: Add some DOINGASYNC check to match ffs.
This is mostly cosmetical.

Reviewed by:	bde
MFC after:	3 days
2013-01-18 19:11:17 +00:00
John Baldwin
d177f14da9 Use vfs_timestamp() to set file timestamps rather than invoking
getmicrotime() or getnanotime() directly in NFS.

Reviewed by:	rmacklem, bde
MFC after:	1 week
2013-01-18 18:43:38 +00:00
John Baldwin
39804bc89d Remove a no-longer-used variable after the previous change to use
VA_UTIMES_NULL.

Submitted by:	bde, rmacklem
MFC after:	1 week
2013-01-17 18:45:20 +00:00
John Baldwin
5055536eec Use the VA_UTIMES_NULL flag to detect when NULL was passed to utimes()
instead of comparing the desired time against the current time as a
heuristic.

Reviewed by:	rmacklem
MFC after:	1 week
2013-01-16 21:52:31 +00:00
Konstantin Belousov
e8f966eeb8 Remove the filtering of the acceptable mount options for nullfs, added
in r245004.  Although the report was for noatime option which is
non-functional for the nullfs, other standard options like nosuid or
noexec are useful with it.

Reported by:	Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au>
MFC after:	3 days
2013-01-16 05:32:49 +00:00
John Baldwin
6910d7a0d8 - More properly handle interrupted NFS requests on an interruptible mount
by returning an error of EINTR rather than EACCES.
- While here, bring back some (but not all) of the NFS RPC statistics lost
  when krpc was committed.

Reviewed by:	rmacklem
MFC after:	1 week
2013-01-15 22:08:17 +00:00
Konstantin Belousov
603f963e56 The current default size of the nullfs hash table used to lookup the
existing nullfs vnode by the lower vnode is only 16 slots.  Since the
default mode for the nullfs is to cache the vnodes, hash has extremely
huge chains.

Size the nullfs hashtbl based on the current value of
desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a
given vnode.

Pointy hat to:	    kib
Diagnosed and reviewed by:	peter
Tested by:    peter, pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
2013-01-14 05:44:47 +00:00
Konstantin Belousov
6b17595133 When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed,
get back the leased write reference from the lower vnode.  There is no
other path which can correct v_writecount on the lowervp.

Reported by:	flo
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2013-01-10 18:24:48 +00:00
Baptiste Daroussin
3d94054c30 Add support for IO_APPEND flag in fuse
This make open(..., O_APPEND) actually works on fuse filesystem.

Reviewed by:	attilio
2013-01-08 12:21:50 +00:00
Pedro F. Giffuni
98f5c0d41a ext2fs: cleanup de dinode structure.
It was plagued with style errors and the offsets had been lost.
While here took the time to update the fields according to the
latest ext4 documentation.

Reviewed by:	bde
MFC after:	3 days
2013-01-07 03:36:32 +00:00
Gleb Kurtsou
4fd5efe79e tmpfs: Replace directory entry linked list with RB-Tree.
Use file name hash as a tree key, handle duplicate keys.  Both VOP_LOOKUP
and VOP_READDIR operations utilize same tree for search.  Directory
entry offset (cookie) is either file name hash or incremental id in case
of hash collisions (duplicate-cookies).  Keep sorted per directory list
of duplicate-cookie entries to facilitate cookie number allocation.

Don't fail if previous VOP_READDIR() offset is no longer valid, start
with next dirent instead.  Other file system handle it similarly.

Workaround race prone tn_readdir_last[pn] fields update.

Add tmpfs_dir_destroy() to free all dirents.

Set NFS cookies in tmpfs_dir_getdents(). Return EJUSTRETURN from
tmpfs_dir_getdents() instead of hard coded -1.

Mark directory traversal routines static as they are no longer
used outside of tmpfs_subr.c
2013-01-06 22:15:44 +00:00
Konstantin Belousov
268dd286a0 Fix reversed condition in the assertion.
Pointy hat to:	kib
MFC after:	13 days
2013-01-04 07:52:47 +00:00
Konstantin Belousov
9cf4c952ca Add the "nocache" nullfs mount option, which disables the caching of
the free nullfs vnodes, switching nullfs behaviour to pre-r240285.
The option is mostly intended as the last-resort when higher pressure
on the vnode cache due to doubling of the vnode counts is not
desirable.

Note that disabling the cache costs more than 2x wall time in the
metadata-hungry scenarious.  The default is "cache".

Tested and benchmarked by:	pho (previous version)
MFC after:	2 weeks
2013-01-03 19:17:57 +00:00
Konstantin Belousov
6b54784391 Remove the last use of the deprecated MNT_VNODE_FOREACH interface in
the tree.

With the help from:	mjg
Tested by:	Ronald Klop <ronald-freebsd8@klop.yi.org>
MFC after:	2 weeks
2013-01-03 19:01:56 +00:00
Konstantin Belousov
ad9789f6db Do not force a writer to the devfs file to drain the buffer writes.
Requested and tested by:	Ian Lepore <freebsd@damnhippie.dyndns.org>
MFC after:	2 weeks
2012-12-23 22:43:27 +00:00
Pedro F. Giffuni
e28f5d5222 More constant renaming in preparation for newer features.
We also try to make better use of the fs flags instead of
trying adapt the code according to the fs structures. In
the case of subsecond timestamps and birthtime we now
check that the feature is explicitly enabled: previously
we only checked that the reserved space was available and
silently wrote them.

This approach is much safer, especially if the filesystem
happens to use embedded inodes or support EAs.

Discussed with:	Zheng Liu
MFC after:	5 days
2012-12-20 02:22:36 +00:00
Rick Macklem
ef8f1261d2 Add "nfsstat -m" support for the two new NFS mount options
added by r244042.
2012-12-09 22:23:50 +00:00
Rick Macklem
1f60bfd822 Move the NFSv4.1 client patches over from projects/nfsv4.1-client
to head. I don't think the NFS client behaviour will change unless
the new "minorversion=1" mount option is used. It includes basic
NFSv4.1 support plus support for pNFS using the Files Layout only.
All problems detecting during an NFSv4.1 Bakeathon testing event
in June 2012 have been resolved in this code and it has been tested
against the NFSv4.1 server available to me.
Although not reviewed, I believe that kib@ has looked at it.
2012-12-08 22:52:39 +00:00
Gleb Smirnoff
eb1b1807af Mechanically substitute flags from historic mbuf allocator with
malloc(9) flags within sys.

Exceptions:

- sys/contrib not touched
- sys/mbuf.h edited manually
2012-12-05 08:04:20 +00:00
Rick Macklem
99d2727d67 Add an nfssvc() option to the kernel for the new NFS client
which dumps out the actual options being used by an NFS mount.
This will be used to implement a "-m" option for nfsstat(1).

Reviewed by:	alfred
MFC after:	2 weeks
2012-12-02 01:16:04 +00:00