Commit Graph

3035 Commits

Author SHA1 Message Date
Rick Macklem
88a2437a65 Add support for host-based (Kerberos 5 service principal) initiator
credentials to the kernel rpc. Modify the NFSv4 client to add
support for the gssname and allgssname mount options to use this
capability. Requires the gssd daemon to be running with the "-h" option.

Reviewed by:	jhb
2013-07-09 01:05:28 +00:00
Pedro F. Giffuni
7ce75e5f1f Avoid a panic and return EINVAL instead.
Merge from UFS r232692:
syscall() fuzzing can trigger this panic.

MFC after:	3 days
2013-07-08 20:21:36 +00:00
Pedro F. Giffuni
bdf1d79884 Implement SEEK_HOLE/SEEK_DATA for ext2fs.
Merged from r236044 on UFS.

MFC after:	3 days
2013-07-07 15:51:28 +00:00
Pedro F. Giffuni
d66aed2e76 Fix some typos.
MFC after:	1 week
2013-07-07 01:32:52 +00:00
Pedro F. Giffuni
91f5a4670f Initial implementation of the HTree directory index.
This is a port of NetBSD's GSoC 2012 Ext3 HTree directory indexing
by Vyacheslav Matyushin.  It was cleaned up and enhanced for FreeBSD
by Zheng Liu (lz@).

This is an excellent example of work shared among different projects:
Vyacheslav was able to look at an early prototype from Zheng Liu who
was also able to check the code from Haiku (with permission).

As in linux, the feature is not available by default and must be
enabled explicitly with tune2fs. We still do not support the
workarounds required in readdir for NFS.

Submitted by:	Zheng Liu
Tested by:	Mike Ma
Sponsored by:	Google Inc.
MFC after:	1 week
2013-07-06 18:28:06 +00:00
Konstantin Belousov
18a8d3d7f8 The tvp vnode on rename is usually unlinked. Drop the cached null
vnode for tvp to allow the free of the lower vnode, if needed.

PR:	kern/180236
Tested by:	smh
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-07-04 19:01:18 +00:00
Davide Italiano
9e9421bcdf - Fix double frees/user after free.
- Allocate using smb_rq_alloc() instead of inlining it.

Reported by:	uqs
Found with:	Coverity Scan
2013-07-03 10:31:45 +00:00
Rick Macklem
a820822ec8 A problem with the old NFS client where large writes to large files
would sometimes result in a corrupted file was reported via email.
This problem appears to have been caused by r251719 (reverting
r251719 fixed the problem). Although I have not been able to
reproduce this problem, I suspect it is caused by another thread
increasing np->n_size after the mtx_unlock(&np->n_mtx) but before
the vnode_pager_setsize() call. Since the np->n_mtx mutex serializes
updates to np->n_size, doing the vnode_pager_setsize() with the
mutex locked appears to avoid the problem.
Unfortunately, vnode_pager_setsize() where the new size is smaller,
cannot be called with a mutex held.
This patch returns the semantics to be close to pre-r251719 (actually
pre-r248567, r248581, r248567 for the new client) such that the call to
vnode_pager_setsize() is only delayed until after the mutex is
unlocked when np->n_size is shrinking. Since the file is growing
when being written, I believe this will fix the corruption.
A better solution might be to replace the mutex with a sleep lock,
but that is a non-trivial conversion, so this fix is hoped to be
sufficient in the meantime.

Reported by:	David G. Lawrence (dg@dglawrence.com)
Tested by:	David G. Lawrence (to be done soon)
Reviewed by:	kib
MFC after:	1 week
2013-07-03 00:19:03 +00:00
Pedro F. Giffuni
e2bc2ccec0 ext2fs: Use the complete random() range in i_gen.
i_gen is unsigned in ext2fs so we can handle the complete
32 bits.

MFC after:	1 week
2013-06-30 00:42:51 +00:00
Pedro F. Giffuni
d849f17dca Bring some updates from ufs_lookup to ext2fs.
r156418:

Don't set IN_CHANGE and IN_UPDATE on inodes for potentially suspended
file systems.  This could cause deadlocks when creating snapshots.
(We can't do snapshots on ext2fs but it is useful to keep things in sync).

r183079:

- Only set i_offset in the parent directory's i-node during a lookup for
  non-LOOKUP operations.
- Relax a VOP assertion for a DELETE lookup.

r187528:

Move the code from ufs_lookup.c used to do dotdot lookup, into
the helper function. It is supposed to be useful for any filesystem
that has to unlock dvp to walk to the ".." entry in lookup routine.

MFC after:	5 days
2013-06-29 01:35:28 +00:00
Davide Italiano
bbc6d2c1af Properly use v_data field. This magically worked (even if wrong) until
now because v_data is the first field of the structure, but it's not
something we should rely on.
2013-06-28 20:32:48 +00:00
Davide Italiano
189e41259b Garbage collect an useless check. smp should be never NULL. 2013-06-28 20:14:30 +00:00
Davide Italiano
c7d2e4cf9b Plug a couple of leakages in smbfs_lookup(). 2013-06-28 20:07:24 +00:00
Pedro F. Giffuni
fafb835a0b Minor sorting.
MFC after:	3 days
2013-06-26 19:43:22 +00:00
Pedro F. Giffuni
da057ed2d3 Define and use e2fs_lbn_t in ext2fs.
In line to what is done in UFS, define an internal type
e2fs_lbn_t for the logical block numbers.

This change is basically a no-op as the new type is unchanged
(int32_t) but it may be useful as bumping this may be required
for ext4fs.

Also, as pointed out by Bruce Evans:

-Use daddr_t for daddr in ext2_bmaparray(). This seems to
improve reliability with the reallocblks option.
- Add a cast to the fsbtodb() macro as in UFS.

Reviewed by:	bde
MFC after:	3 days
2013-06-23 02:44:42 +00:00
Rick Macklem
2e6a4b0c55 Fix r252074 so that it builds on 64bit arches. 2013-06-22 21:58:21 +00:00
Rick Macklem
1dd95a046c The NFSv4.1 LayoutCommit operation requires a valid offset and length.
(0, 0 is not sufficient) This patch a loop for each file layout, using
the offset, length of each file layout in a separate LayoutCommit.
2013-06-21 22:46:16 +00:00
Rick Macklem
562395581b When the NFSv4.1 client is writing to a pNFS Data Server (DS), the
file's size attribute does not get updated. As such, it is necessary
to invalidate the attribute cache before clearing NMODIFIED for pNFS.

MFC after:	2 weeks
2013-06-21 22:26:18 +00:00
Rick Macklem
315c38d135 Since some NFSv4 servers enforce the requirement for a reserved port#,
enable use of the (no)resvport mount option for NFSv4. I had thought
that the RFC required that non-reserved port #s be allowed, but I couldn't
find it in the RFC.

MFC after:	2 weeks
2013-06-21 19:41:30 +00:00
Pedro F. Giffuni
3f5747b69d Rename some prefixes in the Block Group Descriptor fields to ext4bgd_
Change prefix to avoid confusion and denote that these fields
are generally only available starting with ext4.

MFC after:	3 days
2013-06-20 00:00:33 +00:00
Pedro F. Giffuni
9e43acf6c0 More ext2fs header cleanups:
- Set MAXMNTLEN nearer to where it is used.
- Move EXT2_LINK_MAX to ext2_dir.h .

MFC after:	3 days
2013-06-18 15:49:30 +00:00
Pedro F. Giffuni
ebf0f88839 Rename remaining DIAGNOSTIC to INVARIANTS.
MFC after:	3 days
2013-06-17 00:39:23 +00:00
Pedro F. Giffuni
b6113fb31a Re-sort ext2fs headers to make things easier to find.
In the ext2fs driver we have a mixture of headers:

- The ext2_ prefixed headers have strong influence from NetBSD
and are carry specific ext2/3/4 information.
- The unprefixed headers are inspired on UFS and carry implementation
specific information.

Do some small adjustments so that the information is easier to
find coming from either UFS or the NetBSD implementation.

MFC after:	3 days
2013-06-16 16:10:45 +00:00
Pedro F. Giffuni
f744956b4a Relax some unnecessary unsigned type changes in ext2fs.
While the changes in r245820 are in line with the ext2 spec,
the code derived from UFS can use negative values so it is
better to relax some types to keep them as they were, and
somewhat more similar to UFS. While here clean some casts.

Some of the original types are still wrong and will require
more work.

Discussed with:	bde
MFC after:	3 days
2013-06-13 03:23:24 +00:00
Pedro F. Giffuni
77b193c249 Turn DIAGNOSTICs to INVARIANTS in ext2fs.
This is done to be consistent with what other filesystems and
particularly ffs already does (see r173464).

MFC after:	5 days
2013-06-12 15:24:48 +00:00
Pedro F. Giffuni
abe38ac774 s/file system/filesystem/g
Based on r96755 from UFS.

MFC after:	3 days
2013-06-11 02:47:07 +00:00
Pedro F. Giffuni
f7d4b4d3d1 e2fs_bpg and e2fs_isize are always unsigned.
The superblock in ext2fs defines all the fields as unsigned but for
some reason the in-memory superblock was carrying e2fs_bpg and
e2fs_isize as signed.

We should preserve the specified types for consistency.

MFC after:	5 days
2013-06-09 01:38:51 +00:00
Alan Cox
f50b6721e1 Add missing VM object unlocks in an error case.
Reviewed by:	kib
2013-06-07 19:42:00 +00:00
Alan Cox
27a18d6a23 Don't busy the page unless we are likely to release the object lock.
Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
2013-06-06 06:17:20 +00:00
Alan Cox
66c392df53 Relax the vm object locking. Use a read lock.
Sponsored by:	EMC / Isilon Storage Division
2013-06-05 17:00:10 +00:00
Alan Cox
ba887a9b33 Eliminate unnecessary vm object locking from tmpfs_nocacheread(). 2013-06-04 15:40:45 +00:00
Pedro F. Giffuni
532ebe1313 ext2fs: space vs tab.
Obtained from:	Christoph Mallon
MFC after:	3 days
2013-06-03 20:33:05 +00:00
Pedro F. Giffuni
fc3ea958b2 ext2fs: Small cosmetic fixes.
Make a long macro readable and sort a header.

Obtained from:	Christoph Mallon
MFC after:	3 days
2013-06-03 20:02:45 +00:00
Pedro F. Giffuni
4f69a09308 ext2fs: Update Block Group Descriptor struct.
Uncover some, previously reserved, fields that are used by Ext4.
These are currently unused but it is good to have them for future
reference.

Reviewed by:	bde
MFC after:	3 days
2013-06-03 18:52:14 +00:00
Jeff Roberson
22a722605d - Convert the bufobj lock to rwlock.
- Use a shared bufobj lock in getblk() and inmem().
 - Convert softdep's lk to rwlock to match the bufobj lock.
 - Move INFREECNT to b_flags and protect it with the buf lock.
 - Remove unnecessary locking around bremfree() and BKGRDINPROG.

Sponsored by:	EMC / Isilon Storage Division
Discussed with:	mckusick, kib, mdf
2013-05-31 00:43:41 +00:00
Konstantin Belousov
67b4ed4b88 Assert that OBJ_TMPFS flag on the vm object for the tmpfs node is
cleared when the tmpfs node is going away.

Tested by:	bdrewery, pho
2013-05-30 19:51:33 +00:00
Rick Macklem
734b03c38d Post-r248567, there were times when the client would return a
truncated directory for some NFS servers. This turned out to
be because the size of a directory reported by an NFS server
can be smaller that the ufs-like directory created from the
RPC XDR in the client. This patch fixes the problem by changing
r248567 so that vnode_pager_setsize() is only done for regular files.

Reported and tested by:	hartmut.brandt@dlr.de
Reviewed by:	kib
MFC after:	1 week
2013-05-28 22:36:01 +00:00
Konstantin Belousov
74c7ff1a0e Do not leak the NULLV_NOUNLOCK flag from the nullfs_unlink_lowervp(),
for the case when the nullfs vnode is not reclaimed.  Otherwise, later
reclamation would not unlock the lower vnode.

Reported by:	antoine
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-05-21 11:31:56 +00:00
Dag-Erling Smørgrav
72ccd4cc6b Fix typo in comment.
Submitted by:	Alex Weber <alexwebr@gmail.com>
MFC after:	1 week
2013-05-15 08:38:49 +00:00
Rick Macklem
77a03c148c Add support for the eofflag to nfs_readdir() in the new NFS
client so that it works under a unionfs mount.

Submitted by:	Jared Yanovich (slovichon@gmail.com)
Reviewed by:	kib
MFC after:	2 weeks
2013-05-12 21:48:08 +00:00
Eitan Adler
a164074fc4 Fix several typos
PR:		kern/176054
Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
MFC after:	3 days
2013-05-12 16:43:26 +00:00
Jilles Tjoelker
d3045c081d fdescfs: Supply a real value for d_type in readdir.
All the fdescfs nodes (except . and ..) appear as character devices to
stat(), so DT_CHR is correct.
2013-05-12 15:44:49 +00:00
Konstantin Belousov
0fc6daa72d - Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The
null_hashget() obtains the reference on the nullfs vnode, which must
  be dropped.

- Fix a wart which existed from the introduction of the nullfs
  caching, do not unlock lower vnode in the nullfs_reclaim_lowervp().
  It should be innocent, but now it is also formally safe.  Inform the
  nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on
  nullfs inode.

- Add a callback to the upper filesystems for the lower vnode
  unlinking. When inactivating a nullfs vnode, check if the lower
  vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC
  on the lower vnode, and reclaim upper vnode if so.  This allows
  nullfs to purge cached vnodes for the unlinked lower vnode, avoiding
  excessive caching.

Reported by:	G??ran L??wkrantz <goran.lowkrantz@ismobile.com>
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-05-11 11:17:44 +00:00
Konstantin Belousov
3fa456b35d Avoid deactivating the page if it is already on a queue, only requeue
the page.  This both reduces the number of queues locking and avoids
moving the active page to inactive list just because the page was read
or written.

Based on the suggestion by:	alc
Reviewed by: alc
Tested by:   pho
2013-05-06 21:04:42 +00:00
Davide Italiano
caa8e38fa6 Change VM_OBJECT_LOCK/UNLOCK() -> VM_OBJECT_WLOCK/WUNLOCK() to reflect
the recent switch of the vm object lock to a rwlock.

Reported by:	attilio
2013-05-04 14:27:28 +00:00
Davide Italiano
a4c059845a Overhaul locking in netsmb, getting rid of the obsolete lockmgr() primitive.
This solves a long standing LOR between smb_conn and smb_vc.

Tested by:	martymac, pho (previous version)
2013-05-04 14:18:10 +00:00
Davide Italiano
92a4d9bcc8 Completely rewrite the interface to smbdev switching from dev_clone
to cdevpriv(9). This commit changes the semantic of mount_smbfs
in userland as well, which now passes file descriptor in order to
to mount a specific filesystem istance.

Reviewed by:	attilio, ed
Tested by:	martymac
2013-05-04 14:03:18 +00:00
Konstantin Belousov
293e4eb67d The fsync(2) call should sync the vnode in such way that even after
system crash which happen after successfull fsync() return, the data
is accessible.  For msdosfs, this means that FAT entries for the file
must be written.

Since we do not track the FAT blocks containing entries for the
current file, just do a sloppy sync of the devvp vnode for the mount,
which buffers, among other things, contain FAT blocks.

Simultaneously, for deupdat():
- optimize by clearing the modified flags before short-circuiting a
  return, if the mount is read-only;
- only ignore the rest of the function for denode with DE_MODIFIED
  flag clear when the waitfor argument is false.  The directory buffer
  for the entry might be of delayed write;
- microoptimize by comparing the updated directory entry with the
  current block content;
- try to cluster the write, fall back to bawrite() if low on
  resources.

Based on the submission by:	bde
MFC after:	2 weeks
2013-05-02 20:00:11 +00:00
Konstantin Belousov
df6b240b6f Fix the v_object leak for non-regular tmpfs vnodes.
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-05-02 18:46:31 +00:00
Konstantin Belousov
158cc900bb For the new regular tmpfs vnode, v_object is initialized before
insmntque() is called.  The standard insmntque destructor resets the
vop vector to deadfs one, and calls vgone() on the vnode.  As result,
v_object is kept unchanged, which triggers an assertion in the reclaim
code, on instmntque() failure.  Also, in this case, OBJ_TMPFS flag on
the backed vm object is not cleared.

Provide the tmpfs insmntque() destructor which properly clears
OBJ_TMPFS flag and resets v_object.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-05-02 18:44:31 +00:00