3084 Commits

Author SHA1 Message Date
davide
6321e04894 - Fix double frees/user after free.
- Allocate using smb_rq_alloc() instead of inlining it.

Reported by:	uqs
Found with:	Coverity Scan
2013-07-03 10:31:45 +00:00
rmacklem
8b2d48c78b A problem with the old NFS client where large writes to large files
would sometimes result in a corrupted file was reported via email.
This problem appears to have been caused by r251719 (reverting
r251719 fixed the problem). Although I have not been able to
reproduce this problem, I suspect it is caused by another thread
increasing np->n_size after the mtx_unlock(&np->n_mtx) but before
the vnode_pager_setsize() call. Since the np->n_mtx mutex serializes
updates to np->n_size, doing the vnode_pager_setsize() with the
mutex locked appears to avoid the problem.
Unfortunately, vnode_pager_setsize() where the new size is smaller,
cannot be called with a mutex held.
This patch returns the semantics to be close to pre-r251719 (actually
pre-r248567, r248581, r248567 for the new client) such that the call to
vnode_pager_setsize() is only delayed until after the mutex is
unlocked when np->n_size is shrinking. Since the file is growing
when being written, I believe this will fix the corruption.
A better solution might be to replace the mutex with a sleep lock,
but that is a non-trivial conversion, so this fix is hoped to be
sufficient in the meantime.

Reported by:	David G. Lawrence (dg@dglawrence.com)
Tested by:	David G. Lawrence (to be done soon)
Reviewed by:	kib
MFC after:	1 week
2013-07-03 00:19:03 +00:00
pfg
b08e5aff18 ext2fs: Use the complete random() range in i_gen.
i_gen is unsigned in ext2fs so we can handle the complete
32 bits.

MFC after:	1 week
2013-06-30 00:42:51 +00:00
pfg
7e708941c4 Bring some updates from ufs_lookup to ext2fs.
r156418:

Don't set IN_CHANGE and IN_UPDATE on inodes for potentially suspended
file systems.  This could cause deadlocks when creating snapshots.
(We can't do snapshots on ext2fs but it is useful to keep things in sync).

r183079:

- Only set i_offset in the parent directory's i-node during a lookup for
  non-LOOKUP operations.
- Relax a VOP assertion for a DELETE lookup.

r187528:

Move the code from ufs_lookup.c used to do dotdot lookup, into
the helper function. It is supposed to be useful for any filesystem
that has to unlock dvp to walk to the ".." entry in lookup routine.

MFC after:	5 days
2013-06-29 01:35:28 +00:00
davide
a0c5d96b0a Properly use v_data field. This magically worked (even if wrong) until
now because v_data is the first field of the structure, but it's not
something we should rely on.
2013-06-28 20:32:48 +00:00
davide
a47b6265f2 Garbage collect an useless check. smp should be never NULL. 2013-06-28 20:14:30 +00:00
davide
cf63c799c6 Plug a couple of leakages in smbfs_lookup(). 2013-06-28 20:07:24 +00:00
pfg
ef282086af Minor sorting.
MFC after:	3 days
2013-06-26 19:43:22 +00:00
pfg
456a58318f Define and use e2fs_lbn_t in ext2fs.
In line to what is done in UFS, define an internal type
e2fs_lbn_t for the logical block numbers.

This change is basically a no-op as the new type is unchanged
(int32_t) but it may be useful as bumping this may be required
for ext4fs.

Also, as pointed out by Bruce Evans:

-Use daddr_t for daddr in ext2_bmaparray(). This seems to
improve reliability with the reallocblks option.
- Add a cast to the fsbtodb() macro as in UFS.

Reviewed by:	bde
MFC after:	3 days
2013-06-23 02:44:42 +00:00
rmacklem
198db6adb9 Fix r252074 so that it builds on 64bit arches. 2013-06-22 21:58:21 +00:00
rmacklem
121288ceb2 The NFSv4.1 LayoutCommit operation requires a valid offset and length.
(0, 0 is not sufficient) This patch a loop for each file layout, using
the offset, length of each file layout in a separate LayoutCommit.
2013-06-21 22:46:16 +00:00
rmacklem
b3e84941b0 When the NFSv4.1 client is writing to a pNFS Data Server (DS), the
file's size attribute does not get updated. As such, it is necessary
to invalidate the attribute cache before clearing NMODIFIED for pNFS.

MFC after:	2 weeks
2013-06-21 22:26:18 +00:00
rmacklem
2bc76c9d96 Since some NFSv4 servers enforce the requirement for a reserved port#,
enable use of the (no)resvport mount option for NFSv4. I had thought
that the RFC required that non-reserved port #s be allowed, but I couldn't
find it in the RFC.

MFC after:	2 weeks
2013-06-21 19:41:30 +00:00
pfg
ef31b55ff7 Rename some prefixes in the Block Group Descriptor fields to ext4bgd_
Change prefix to avoid confusion and denote that these fields
are generally only available starting with ext4.

MFC after:	3 days
2013-06-20 00:00:33 +00:00
pfg
9ac60dd284 More ext2fs header cleanups:
- Set MAXMNTLEN nearer to where it is used.
- Move EXT2_LINK_MAX to ext2_dir.h .

MFC after:	3 days
2013-06-18 15:49:30 +00:00
pfg
fb29984476 Rename remaining DIAGNOSTIC to INVARIANTS.
MFC after:	3 days
2013-06-17 00:39:23 +00:00
pfg
5cf909a854 Re-sort ext2fs headers to make things easier to find.
In the ext2fs driver we have a mixture of headers:

- The ext2_ prefixed headers have strong influence from NetBSD
and are carry specific ext2/3/4 information.
- The unprefixed headers are inspired on UFS and carry implementation
specific information.

Do some small adjustments so that the information is easier to
find coming from either UFS or the NetBSD implementation.

MFC after:	3 days
2013-06-16 16:10:45 +00:00
pfg
4d854ec921 Relax some unnecessary unsigned type changes in ext2fs.
While the changes in r245820 are in line with the ext2 spec,
the code derived from UFS can use negative values so it is
better to relax some types to keep them as they were, and
somewhat more similar to UFS. While here clean some casts.

Some of the original types are still wrong and will require
more work.

Discussed with:	bde
MFC after:	3 days
2013-06-13 03:23:24 +00:00
pfg
e9fce1a99b Turn DIAGNOSTICs to INVARIANTS in ext2fs.
This is done to be consistent with what other filesystems and
particularly ffs already does (see r173464).

MFC after:	5 days
2013-06-12 15:24:48 +00:00
pfg
adb723b809 s/file system/filesystem/g
Based on r96755 from UFS.

MFC after:	3 days
2013-06-11 02:47:07 +00:00
pfg
a8d161091f e2fs_bpg and e2fs_isize are always unsigned.
The superblock in ext2fs defines all the fields as unsigned but for
some reason the in-memory superblock was carrying e2fs_bpg and
e2fs_isize as signed.

We should preserve the specified types for consistency.

MFC after:	5 days
2013-06-09 01:38:51 +00:00
alc
47492525b1 Add missing VM object unlocks in an error case.
Reviewed by:	kib
2013-06-07 19:42:00 +00:00
alc
e60ab9c72d Don't busy the page unless we are likely to release the object lock.
Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
2013-06-06 06:17:20 +00:00
alc
b4fae70474 Relax the vm object locking. Use a read lock.
Sponsored by:	EMC / Isilon Storage Division
2013-06-05 17:00:10 +00:00
alc
e4b61d89d1 Eliminate unnecessary vm object locking from tmpfs_nocacheread(). 2013-06-04 15:40:45 +00:00
pfg
f11bbd3dc5 ext2fs: space vs tab.
Obtained from:	Christoph Mallon
MFC after:	3 days
2013-06-03 20:33:05 +00:00
pfg
1687488775 ext2fs: Small cosmetic fixes.
Make a long macro readable and sort a header.

Obtained from:	Christoph Mallon
MFC after:	3 days
2013-06-03 20:02:45 +00:00
pfg
94ecf49873 ext2fs: Update Block Group Descriptor struct.
Uncover some, previously reserved, fields that are used by Ext4.
These are currently unused but it is good to have them for future
reference.

Reviewed by:	bde
MFC after:	3 days
2013-06-03 18:52:14 +00:00
jeff
d7efebc4db - Convert the bufobj lock to rwlock.
- Use a shared bufobj lock in getblk() and inmem().
 - Convert softdep's lk to rwlock to match the bufobj lock.
 - Move INFREECNT to b_flags and protect it with the buf lock.
 - Remove unnecessary locking around bremfree() and BKGRDINPROG.

Sponsored by:	EMC / Isilon Storage Division
Discussed with:	mckusick, kib, mdf
2013-05-31 00:43:41 +00:00
kib
b94a58f986 Assert that OBJ_TMPFS flag on the vm object for the tmpfs node is
cleared when the tmpfs node is going away.

Tested by:	bdrewery, pho
2013-05-30 19:51:33 +00:00
rmacklem
ff1318447e Post-r248567, there were times when the client would return a
truncated directory for some NFS servers. This turned out to
be because the size of a directory reported by an NFS server
can be smaller that the ufs-like directory created from the
RPC XDR in the client. This patch fixes the problem by changing
r248567 so that vnode_pager_setsize() is only done for regular files.

Reported and tested by:	hartmut.brandt@dlr.de
Reviewed by:	kib
MFC after:	1 week
2013-05-28 22:36:01 +00:00
kib
e68d3a11b7 Do not leak the NULLV_NOUNLOCK flag from the nullfs_unlink_lowervp(),
for the case when the nullfs vnode is not reclaimed.  Otherwise, later
reclamation would not unlock the lower vnode.

Reported by:	antoine
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-05-21 11:31:56 +00:00
des
0a904132b4 Fix typo in comment.
Submitted by:	Alex Weber <alexwebr@gmail.com>
MFC after:	1 week
2013-05-15 08:38:49 +00:00
rmacklem
4bf4286804 Add support for the eofflag to nfs_readdir() in the new NFS
client so that it works under a unionfs mount.

Submitted by:	Jared Yanovich (slovichon@gmail.com)
Reviewed by:	kib
MFC after:	2 weeks
2013-05-12 21:48:08 +00:00
eadler
6907881cb8 Fix several typos
PR:		kern/176054
Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
MFC after:	3 days
2013-05-12 16:43:26 +00:00
jilles
88b36000c5 fdescfs: Supply a real value for d_type in readdir.
All the fdescfs nodes (except . and ..) appear as character devices to
stat(), so DT_CHR is correct.
2013-05-12 15:44:49 +00:00
kib
dfd7a7f46d - Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The
null_hashget() obtains the reference on the nullfs vnode, which must
  be dropped.

- Fix a wart which existed from the introduction of the nullfs
  caching, do not unlock lower vnode in the nullfs_reclaim_lowervp().
  It should be innocent, but now it is also formally safe.  Inform the
  nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on
  nullfs inode.

- Add a callback to the upper filesystems for the lower vnode
  unlinking. When inactivating a nullfs vnode, check if the lower
  vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC
  on the lower vnode, and reclaim upper vnode if so.  This allows
  nullfs to purge cached vnodes for the unlinked lower vnode, avoiding
  excessive caching.

Reported by:	G??ran L??wkrantz <goran.lowkrantz@ismobile.com>
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-05-11 11:17:44 +00:00
kib
37807bfa8a Avoid deactivating the page if it is already on a queue, only requeue
the page.  This both reduces the number of queues locking and avoids
moving the active page to inactive list just because the page was read
or written.

Based on the suggestion by:	alc
Reviewed by: alc
Tested by:   pho
2013-05-06 21:04:42 +00:00
davide
d0199b6a2f Change VM_OBJECT_LOCK/UNLOCK() -> VM_OBJECT_WLOCK/WUNLOCK() to reflect
the recent switch of the vm object lock to a rwlock.

Reported by:	attilio
2013-05-04 14:27:28 +00:00
davide
8fda0a3d2e Overhaul locking in netsmb, getting rid of the obsolete lockmgr() primitive.
This solves a long standing LOR between smb_conn and smb_vc.

Tested by:	martymac, pho (previous version)
2013-05-04 14:18:10 +00:00
davide
49171951e3 Completely rewrite the interface to smbdev switching from dev_clone
to cdevpriv(9). This commit changes the semantic of mount_smbfs
in userland as well, which now passes file descriptor in order to
to mount a specific filesystem istance.

Reviewed by:	attilio, ed
Tested by:	martymac
2013-05-04 14:03:18 +00:00
kib
3bbdcb77fd The fsync(2) call should sync the vnode in such way that even after
system crash which happen after successfull fsync() return, the data
is accessible.  For msdosfs, this means that FAT entries for the file
must be written.

Since we do not track the FAT blocks containing entries for the
current file, just do a sloppy sync of the devvp vnode for the mount,
which buffers, among other things, contain FAT blocks.

Simultaneously, for deupdat():
- optimize by clearing the modified flags before short-circuiting a
  return, if the mount is read-only;
- only ignore the rest of the function for denode with DE_MODIFIED
  flag clear when the waitfor argument is false.  The directory buffer
  for the entry might be of delayed write;
- microoptimize by comparing the updated directory entry with the
  current block content;
- try to cluster the write, fall back to bawrite() if low on
  resources.

Based on the submission by:	bde
MFC after:	2 weeks
2013-05-02 20:00:11 +00:00
kib
a0955c41ee Fix the v_object leak for non-regular tmpfs vnodes.
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-05-02 18:46:31 +00:00
kib
5830ee0d3f For the new regular tmpfs vnode, v_object is initialized before
insmntque() is called.  The standard insmntque destructor resets the
vop vector to deadfs one, and calls vgone() on the vnode.  As result,
v_object is kept unchanged, which triggers an assertion in the reclaim
code, on instmntque() failure.  Also, in this case, OBJ_TMPFS flag on
the backed vm object is not cleared.

Provide the tmpfs insmntque() destructor which properly clears
OBJ_TMPFS flag and resets v_object.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-05-02 18:44:31 +00:00
kib
3a177062d5 The page read or written could be wired. Do not requeue if the page
is not on a queue.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-05-02 18:36:52 +00:00
des
6cfe294ded Fix a bug that allows NFS clients to issue READDIR on files.
PR:		kern/178016
Security:	CVE-2013-3266
Security:	FreeBSD-SA-13:05.nfsserver
2013-04-29 20:09:44 +00:00
kib
2f2c1edec8 Rework the handling of the tmpfs node backing swap object and tmpfs
vnode v_object to avoid double-buffering.  Use the same object both as
the backing store for tmpfs node and as the v_object.

Besides reducing memory use up to 2x times for situation of mapping
files from tmpfs, it also makes tmpfs read and write operations copy
twice bytes less.

VM subsystem was already slightly adapted to tolerate OBJT_SWAP object
as v_object. Now the vm_object_deallocate() is modified to not
reinstantiate OBJ_ONEMAPPING flag and help the VFS to correctly handle
VV_TEXT flag on the last dereference of the tmpfs backing object.

Reviewed by:	alc
Tested by:	pho, bf
MFC after:	1 month
2013-04-28 19:38:59 +00:00
rmacklem
1110825468 When an NFS unmount occurs, once vflush() writes the last dirty
buffer for the last vnode on the mount back to the server, it
returns. At that point, the code continues with the unmount,
including freeing up the nfs specific part of the mount structure.
It is possible that an nfsiod thread will try to check for an
empty I/O queue in the nfs specific part of the mount structure
after it has been free'd by the unmount. This patch avoids this problem by
setting the iodmount entries for the mount back to NULL while holding the
mutex in the unmount and checking the appropriate entry is non-NULL after
acquiring the mutex in the nfsiod thread.

Reported and tested by:	pho
Reviewed by:	kib
MFC after:	2 weeks
2013-04-18 23:20:16 +00:00
rmacklem
ef3a1169e5 Both NFS clients can deadlock when using the "rdirplus" mount
option. This can occur when an nfsiod thread that already holds
a buffer lock attempts to acquire a vnode lock on an entry in
the directory (a LOR) when another thread holding the vnode lock
is waiting on an nfsiod thread. This patch avoids the deadlock by disabling
readahead for this case, so the nfsiod threads never do readdirplus.
Since readaheads for directories need the directory offset cookie
from the previous read, they cannot normally happen in parallel.
As such, testing by jhb@ and myself didn't find any performance
degredation when this patch is applied. If there is a case where
this results in a significant performance degradation, mounting
without the "rdirplus" option can be done to re-enable readahead
for directories.

Reported and tested by:	jhb
Reviewed by:	jhb
MFC after:	2 weeks
2013-04-18 13:09:04 +00:00
ken
1ab8c6ab7f Move the NFS FHA (File Handle Affinity) code from sys/nfsserver to
sys/nfs, since it is now shared by the two NFS servers.

Suggested by:	rmacklem
Sponsored by:	Spectra Logic
MFC after:	2 weeks
2013-04-17 22:42:43 +00:00