3062 Commits

Author SHA1 Message Date
kib
b8fe5eca7f The tmpfs_alloc_vp() is used to instantiate vnode for the tmpfs node,
in particular, from the tmpfs_lookup VOP method.  If LK_NOWAIT is not
specified in the lkflags, the lookup is supposed to return an alive
vnode whenever the underlying node is valid.

Currently, the tmpfs_alloc_vp() returns ENOENT if the vnode attached
to node exists and is being reclaimed.  This causes spurious ENOENT
errors from lookup on tmpfs and corresponding random 'No such file'
failures from syscalls working with tmpfs files.

Fix this by waiting for the doomed vnode to be detached from the tmpfs
node if sleepable allocation is requested.

Note that filesystems which use vfs_hash.c, correctly handle the case
due to vfs_hash_get() looping when vget() returns ENOENT for sleepable
requests.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-08-05 18:53:59 +00:00
attilio
899ab64514 Revert r253939:
We cannot busy a page before doing pagefaults.
Infact, it can deadlock against vnode lock, as it tries to vget().
Other functions, right now, have an opposite lock ordering, like
vm_object_sync(), which acquires the vnode lock first and then
sleeps on the busy mechanism.

Before this patch is reinserted we need to break this ordering.

Sponsored by:	EMC / Isilon storage division
Reported by:	kib
2013-08-05 08:55:35 +00:00
attilio
19b2ea9f81 The page hold mechanism is fast but it has couple of fallouts:
- It does not let pages respect the LRU policy
- It bloats the active/inactive queues of few pages

Try to avoid it as much as possible with the long-term target to
completely remove it.
Use the soft-busy mechanism to protect page content accesses during
short-term operations (like uiomove_fromphys()).

After this change only vm_fault_quick_hold_pages() is still using the
hold mechanism for page content access.
There is an additional complexity there as the quick path cannot
immediately access the page object to busy the page and the slow path
cannot however busy more than one page a time (to avoid deadlocks).

Fixing such primitive can bring to complete removal of the page hold
mechanism.

Sponsored by:	EMC / Isilon storage division
Discussed with:	alc
Reviewed by:	jeff
Tested by:	pho
2013-08-04 21:07:24 +00:00
attilio
e825889721 Remove unnecessary soft busy of the page before to do vn_rdwr() in
kern_sendfile() which is unnecessary.
The page is already wired so it will not be subjected to pagefault.
The content cannot be effectively protected as it is full of races
already.
Multiple accesses to the same indexes are serialized through vn_rdwr().

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc, jeff
Tested by:	pho
2013-08-04 15:56:19 +00:00
pfg
f35ac11d77 Add license for the half MD4 algorithm used in ext2_half_md4().
The htree implementation uses code derived from the
RSA Data Security, Inc. MD4 Message-Digest Algorithm.

Add a proper licensing statement for the code and clarify
the corresponding comments.

Approved by:	core (hrs)
2013-08-01 16:04:48 +00:00
marius
335ec6e6b3 - Add const-qualifiers to the arguments of isonum_*().
- According to ISO 9660 7.1.2, isonum_712() should return a signed value.
- Try to get isonum_*() closer to style(9).
2013-07-28 12:29:10 +00:00
avg
4b4c561bbf make path matching in devfs rules consistent and sane (and safer)
Before this change path matching had the following features:
- for device nodes the patterns were matched against full path
- in the above case '/' in a path could be matched by a wildcard
- for directories and links only the last component was matched

So, for example, a pattern like 're*' could match the following entries:
- re0 device
- responder/u0 device
- zvol/recpool directory

Although it was possible to work around this behavior (once it was spotted
and understood), it was very confusing and contrary to documentation.

Now we always match a full path for all types of devfs entries (devices,
directories, links) and a '/' has to be matched explicitly.
This behavior follows the shell globbing rules.

This change is originally developed by Jaakko Heinonen.
Many thanks!

PR:		kern/122838
Submitted by:	jh
MFC after:	4 weeks
2013-07-26 14:25:58 +00:00
pfg
d1aa5826fa ext2fs: Return EINVAL for negative uio_offset as in UFS.
While here drop old comment that doesn't really apply.

MFC after:	1 month
Discussed with:	gleb
2013-07-25 19:37:49 +00:00
pfg
b4c061ba11 ext2fs: Drop a check that wan't supposed to be in r253651.
MFC after:	1 month
2013-07-25 16:04:55 +00:00
pfg
3cce3ff568 ext2fs: Don't assume that on-disk format of a directory is the same
as in <sys/dirent.h>

ext2_readdir() has always been very fs specific and different
with respect to its ufs_ counterpart. Recent changes from UFS
have made it possible to share more closely the implementation.

MFUFS r252438:
Always start parsing at DIRBLKSIZ aligned offset, skip first entries if
uio_offset is not DIRBLKSIZ aligned. Return EINVAL if buffer is too
small for single entry.

Preallocate buffer for cookies.

Skip entries with zero inode number.

Reviewed by:	gleb, Zheng Liu
MFC after:	1 month
2013-07-25 15:34:20 +00:00
pfg
c92369e15f fuse: revert kernel_header update.
It seems to be causing problems due to the lack of the new features.

Found by:	bapt
Pointed hat:	pfg
2013-07-24 20:21:29 +00:00
nwhitehorn
697ec4c8f3 tmpfs works perfectly fine with -o union -- there is no reason to exclude it
from the list of options.
2013-07-23 14:48:37 +00:00
rmacklem
51bcea9c2c The NFSv4 server incorrectly assumed that the high order words of
the attribute bitmap argument would be non-zero. This caused an
interoperability problem for a recent patch to the Linux NFSv4 client.
The Linux folks have changed their patch to avoid this, but this
patch fixes the problem on the server.

Reported and tested by:	Andre Heider (a.heider@gmail.com)
MFC after:	3 days
2013-07-20 22:35:32 +00:00
pfg
7d8d22f309 fuse: revert birthtime support.
The creation time support breaks the data structures used in linux
fuse.  libfuse carries it's own header.

Revert the changes for now. We will try to get an agreement with the
fuse  upstream maintainers to avoid having to patch the library
headers all the time.
2013-07-20 14:50:35 +00:00
pfg
ee86b8d348 Adjust outsizes:
Recalculate FUSE_COMPAT_ENTRY_OUT_SIZE and COMPAT_ATTR_OUT_SIZE.
These were wrong in the previous commit. They are actually unused
in FreeBSD though.

Pointed out by:	Jan Beich
2013-07-20 03:55:56 +00:00
pfg
2a02ab82bb Adjust outsizes:
When birthtime was added (r253331) we missed adding the weight
of the new fields in FUSE_COMPAT_ENTRY_OUT_SIZE and
COMPAT_ATTR_OUT_SIZE. Adjust them accordingly.

Pointed out by:	Jan Beich
2013-07-20 03:08:50 +00:00
pfg
2bfbfc539c Update fuse_kernel header.
Bring in the changes from the FUSE kernel interface 7.10
(available under a BSD license).

After 7.10 the linux FUSE developers added support for a
controversial CUSE driver and some linux especific
features that are unlikely to find its way into FreeBSD.

We currently don't implement any of the new features so we
are *not* bumping the FUSE_KERNEL_MINOR_VERSION. The header
should, nevertheless, serve  as a template to add the new
features in a compatible manner.

While here adopt some minor cleanups from the upstream version
like removing FUSE_MAJOR and FUSE_MINOR which were never
used. Also add multiple inclusion header guards,
2013-07-15 00:05:27 +00:00
pfg
e26f629724 Add creation timestamp (birthtime) support for fuse.
I was keeping this #ifdef'd for reference with the MacFUSE change[1]
but on second thought, this is a FreeBSD-only header so the SVN
history should be enough.

Add missing padding while here.

Reference [1]:
http://code.google.com/p/macfuse/source/detail?spec=svn1686&r=1360
2013-07-13 22:06:41 +00:00
pfg
a51cca1f05 Add creation timestamp (birthtime) support for fuse.
This is based on similar support in MacFUSE.
2013-07-12 17:22:59 +00:00
pfg
be085216cb Implement 1003.1-2001 pathconf() keys.
This is based on r106058 in UFS.

MFC after:	1 month
2013-07-10 22:03:01 +00:00
pfg
5a396d8c5b Reinstate the assertion from r253045.
UFS r232732 reverted the change as the real problem was to be fixed
at the syscall level.

Reported by:	bde
2013-07-09 14:23:00 +00:00
pfg
6ca7bf7cd4 Enhancement when writing an entire block of a file.
Merge from UFS r231313:

This change first attempts the uiomove() to the newly allocated
(and dirty) buffer and only zeros it if the uiomove() fails. The
effect is to eliminate the gratuitous zeroing of the buffer in
the usual case where the uiomove() successfully fills it.

MFC after:	3 days
2013-07-09 01:31:04 +00:00
rmacklem
4050bd5a8c Add support for host-based (Kerberos 5 service principal) initiator
credentials to the kernel rpc. Modify the NFSv4 client to add
support for the gssname and allgssname mount options to use this
capability. Requires the gssd daemon to be running with the "-h" option.

Reviewed by:	jhb
2013-07-09 01:05:28 +00:00
pfg
c91c27e5e0 Avoid a panic and return EINVAL instead.
Merge from UFS r232692:
syscall() fuzzing can trigger this panic.

MFC after:	3 days
2013-07-08 20:21:36 +00:00
pfg
6e9d91d26b Implement SEEK_HOLE/SEEK_DATA for ext2fs.
Merged from r236044 on UFS.

MFC after:	3 days
2013-07-07 15:51:28 +00:00
pfg
512e5343cf Fix some typos.
MFC after:	1 week
2013-07-07 01:32:52 +00:00
pfg
73bb131f84 Initial implementation of the HTree directory index.
This is a port of NetBSD's GSoC 2012 Ext3 HTree directory indexing
by Vyacheslav Matyushin.  It was cleaned up and enhanced for FreeBSD
by Zheng Liu (lz@).

This is an excellent example of work shared among different projects:
Vyacheslav was able to look at an early prototype from Zheng Liu who
was also able to check the code from Haiku (with permission).

As in linux, the feature is not available by default and must be
enabled explicitly with tune2fs. We still do not support the
workarounds required in readdir for NFS.

Submitted by:	Zheng Liu
Tested by:	Mike Ma
Sponsored by:	Google Inc.
MFC after:	1 week
2013-07-06 18:28:06 +00:00
kib
e9befff373 The tvp vnode on rename is usually unlinked. Drop the cached null
vnode for tvp to allow the free of the lower vnode, if needed.

PR:	kern/180236
Tested by:	smh
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-07-04 19:01:18 +00:00
davide
6321e04894 - Fix double frees/user after free.
- Allocate using smb_rq_alloc() instead of inlining it.

Reported by:	uqs
Found with:	Coverity Scan
2013-07-03 10:31:45 +00:00
rmacklem
8b2d48c78b A problem with the old NFS client where large writes to large files
would sometimes result in a corrupted file was reported via email.
This problem appears to have been caused by r251719 (reverting
r251719 fixed the problem). Although I have not been able to
reproduce this problem, I suspect it is caused by another thread
increasing np->n_size after the mtx_unlock(&np->n_mtx) but before
the vnode_pager_setsize() call. Since the np->n_mtx mutex serializes
updates to np->n_size, doing the vnode_pager_setsize() with the
mutex locked appears to avoid the problem.
Unfortunately, vnode_pager_setsize() where the new size is smaller,
cannot be called with a mutex held.
This patch returns the semantics to be close to pre-r251719 (actually
pre-r248567, r248581, r248567 for the new client) such that the call to
vnode_pager_setsize() is only delayed until after the mutex is
unlocked when np->n_size is shrinking. Since the file is growing
when being written, I believe this will fix the corruption.
A better solution might be to replace the mutex with a sleep lock,
but that is a non-trivial conversion, so this fix is hoped to be
sufficient in the meantime.

Reported by:	David G. Lawrence (dg@dglawrence.com)
Tested by:	David G. Lawrence (to be done soon)
Reviewed by:	kib
MFC after:	1 week
2013-07-03 00:19:03 +00:00
pfg
b08e5aff18 ext2fs: Use the complete random() range in i_gen.
i_gen is unsigned in ext2fs so we can handle the complete
32 bits.

MFC after:	1 week
2013-06-30 00:42:51 +00:00
pfg
7e708941c4 Bring some updates from ufs_lookup to ext2fs.
r156418:

Don't set IN_CHANGE and IN_UPDATE on inodes for potentially suspended
file systems.  This could cause deadlocks when creating snapshots.
(We can't do snapshots on ext2fs but it is useful to keep things in sync).

r183079:

- Only set i_offset in the parent directory's i-node during a lookup for
  non-LOOKUP operations.
- Relax a VOP assertion for a DELETE lookup.

r187528:

Move the code from ufs_lookup.c used to do dotdot lookup, into
the helper function. It is supposed to be useful for any filesystem
that has to unlock dvp to walk to the ".." entry in lookup routine.

MFC after:	5 days
2013-06-29 01:35:28 +00:00
davide
a0c5d96b0a Properly use v_data field. This magically worked (even if wrong) until
now because v_data is the first field of the structure, but it's not
something we should rely on.
2013-06-28 20:32:48 +00:00
davide
a47b6265f2 Garbage collect an useless check. smp should be never NULL. 2013-06-28 20:14:30 +00:00
davide
cf63c799c6 Plug a couple of leakages in smbfs_lookup(). 2013-06-28 20:07:24 +00:00
pfg
ef282086af Minor sorting.
MFC after:	3 days
2013-06-26 19:43:22 +00:00
pfg
456a58318f Define and use e2fs_lbn_t in ext2fs.
In line to what is done in UFS, define an internal type
e2fs_lbn_t for the logical block numbers.

This change is basically a no-op as the new type is unchanged
(int32_t) but it may be useful as bumping this may be required
for ext4fs.

Also, as pointed out by Bruce Evans:

-Use daddr_t for daddr in ext2_bmaparray(). This seems to
improve reliability with the reallocblks option.
- Add a cast to the fsbtodb() macro as in UFS.

Reviewed by:	bde
MFC after:	3 days
2013-06-23 02:44:42 +00:00
rmacklem
198db6adb9 Fix r252074 so that it builds on 64bit arches. 2013-06-22 21:58:21 +00:00
rmacklem
121288ceb2 The NFSv4.1 LayoutCommit operation requires a valid offset and length.
(0, 0 is not sufficient) This patch a loop for each file layout, using
the offset, length of each file layout in a separate LayoutCommit.
2013-06-21 22:46:16 +00:00
rmacklem
b3e84941b0 When the NFSv4.1 client is writing to a pNFS Data Server (DS), the
file's size attribute does not get updated. As such, it is necessary
to invalidate the attribute cache before clearing NMODIFIED for pNFS.

MFC after:	2 weeks
2013-06-21 22:26:18 +00:00
rmacklem
2bc76c9d96 Since some NFSv4 servers enforce the requirement for a reserved port#,
enable use of the (no)resvport mount option for NFSv4. I had thought
that the RFC required that non-reserved port #s be allowed, but I couldn't
find it in the RFC.

MFC after:	2 weeks
2013-06-21 19:41:30 +00:00
pfg
ef31b55ff7 Rename some prefixes in the Block Group Descriptor fields to ext4bgd_
Change prefix to avoid confusion and denote that these fields
are generally only available starting with ext4.

MFC after:	3 days
2013-06-20 00:00:33 +00:00
pfg
9ac60dd284 More ext2fs header cleanups:
- Set MAXMNTLEN nearer to where it is used.
- Move EXT2_LINK_MAX to ext2_dir.h .

MFC after:	3 days
2013-06-18 15:49:30 +00:00
pfg
fb29984476 Rename remaining DIAGNOSTIC to INVARIANTS.
MFC after:	3 days
2013-06-17 00:39:23 +00:00
pfg
5cf909a854 Re-sort ext2fs headers to make things easier to find.
In the ext2fs driver we have a mixture of headers:

- The ext2_ prefixed headers have strong influence from NetBSD
and are carry specific ext2/3/4 information.
- The unprefixed headers are inspired on UFS and carry implementation
specific information.

Do some small adjustments so that the information is easier to
find coming from either UFS or the NetBSD implementation.

MFC after:	3 days
2013-06-16 16:10:45 +00:00
pfg
4d854ec921 Relax some unnecessary unsigned type changes in ext2fs.
While the changes in r245820 are in line with the ext2 spec,
the code derived from UFS can use negative values so it is
better to relax some types to keep them as they were, and
somewhat more similar to UFS. While here clean some casts.

Some of the original types are still wrong and will require
more work.

Discussed with:	bde
MFC after:	3 days
2013-06-13 03:23:24 +00:00
pfg
e9fce1a99b Turn DIAGNOSTICs to INVARIANTS in ext2fs.
This is done to be consistent with what other filesystems and
particularly ffs already does (see r173464).

MFC after:	5 days
2013-06-12 15:24:48 +00:00
pfg
adb723b809 s/file system/filesystem/g
Based on r96755 from UFS.

MFC after:	3 days
2013-06-11 02:47:07 +00:00
pfg
a8d161091f e2fs_bpg and e2fs_isize are always unsigned.
The superblock in ext2fs defines all the fields as unsigned but for
some reason the in-memory superblock was carrying e2fs_bpg and
e2fs_isize as signed.

We should preserve the specified types for consistency.

MFC after:	5 days
2013-06-09 01:38:51 +00:00
alc
47492525b1 Add missing VM object unlocks in an error case.
Reviewed by:	kib
2013-06-07 19:42:00 +00:00