3029 Commits

Author SHA1 Message Date
Davide Italiano
9e9421bcdf - Fix double frees/use after free.
- Allocate using smb_rq_alloc() instead of inlining it.

Reported by:	uqs
Found with:	Coverity Scan
2013-07-03 10:31:45 +00:00
Rick Macklem
a820822ec8 A problem with the old NFS client where large writes to large files
would sometimes result in a corrupted file was reported via email.
This problem appears to have been caused by r251719 (reverting
r251719 fixed the problem). Although I have not been able to
reproduce this problem, I suspect it is caused by another thread
increasing np->n_size after the mtx_unlock(&np->n_mtx) but before
the vnode_pager_setsize() call. Since the np->n_mtx mutex serializes
updates to np->n_size, doing the vnode_pager_setsize() with the
mutex locked appears to avoid the problem.
Unfortunately, vnode_pager_setsize() where the new size is smaller,
cannot be called with a mutex held.
This patch returns the semantics to be close to pre-r251719 (actually
pre-r248567, r248581, r248567 for the new client) such that the call to
vnode_pager_setsize() is only delayed until after the mutex is
unlocked when np->n_size is shrinking. Since the file is growing
when being written, I believe this will fix the corruption.
A better solution might be to replace the mutex with a sleep lock,
but that is a non-trivial conversion, so this fix is hoped to be
sufficient in the meantime.

Reported by:	David G. Lawrence (dg@dglawrence.com)
Tested by:	David G. Lawrence (to be done soon)
Reviewed by:	kib
MFC after:	1 week
2013-07-03 00:19:03 +00:00
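
The ordering described in a820822ec8 can be sketched in user space as follows. This is only an illustration, not the kernel patch: `struct node`, `pager_setsize()` and `node_setsize()` are hypothetical stand-ins, and a pthread mutex stands in for `np->n_mtx`. The pager call is made with the mutex held when the size grows, and deferred until after the unlock only when the size shrinks.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the NFS node. */
struct node {
    pthread_mutex_t n_mtx;   /* serializes updates to n_size */
    uint64_t n_size;         /* cached file size */
    uint64_t pager_size;     /* size last passed to the "pager" */
};

/* Hypothetical stand-in for vnode_pager_setsize(). */
static void pager_setsize(struct node *np, uint64_t nsize)
{
    np->pager_size = nsize;
}

/*
 * Update the cached size. Returns 1 if the pager call had to be
 * deferred until after the mutex was dropped (shrinking), else 0.
 */
static int node_setsize(struct node *np, uint64_t nsize)
{
    int deferred = 0;

    pthread_mutex_lock(&np->n_mtx);
    if (nsize < np->n_size)
        deferred = 1;               /* shrinking: not allowed under the mutex */
    np->n_size = nsize;
    if (!deferred)
        pager_setsize(np, nsize);   /* growing: done while still locked */
    pthread_mutex_unlock(&np->n_mtx);

    if (deferred)
        pager_setsize(np, nsize);   /* shrinking: done after unlock */
    return deferred;
}
```

With this shape, a concurrent writer that grows the file cannot slip in between the size update and the pager update, which is the window the commit message suspects.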
Pedro F. Giffuni
e2bc2ccec0 ext2fs: Use the complete random() range in i_gen.
i_gen is unsigned in ext2fs so we can handle the complete
32 bits.

MFC after:	1 week
2013-06-30 00:42:51 +00:00
Pedro F. Giffuni
d849f17dca Bring some updates from ufs_lookup to ext2fs.
r156418:

Don't set IN_CHANGE and IN_UPDATE on inodes for potentially suspended
file systems.  This could cause deadlocks when creating snapshots.
(We can't do snapshots on ext2fs but it is useful to keep things in sync).

r183079:

- Only set i_offset in the parent directory's i-node during a lookup for
  non-LOOKUP operations.
- Relax a VOP assertion for a DELETE lookup.

r187528:

Move the code from ufs_lookup.c used to do dotdot lookup, into
the helper function. It is supposed to be useful for any filesystem
that has to unlock dvp to walk to the ".." entry in lookup routine.

MFC after:	5 days
2013-06-29 01:35:28 +00:00
Davide Italiano
bbc6d2c1af Properly use v_data field. This magically worked (even if wrong) until
now because v_data is the first field of the structure, but it's not
something we should rely on.
2013-06-28 20:32:48 +00:00
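
Why the old code in bbc6d2c1af "magically worked" can be shown with a small sketch. The structures and function names below are hypothetical and are not the code that was changed; the sketch only illustrates that reinterpreting a structure pointer as a pointer to its first member gives the right answer only while that member stays first.

```c
#include <stddef.h>

/* Hypothetical filesystem-private data hung off a vnode. */
struct fs_node {
    int id;
};

/* Hypothetical vnode-like structure with v_data as its first field. */
struct vnode_like {
    void *v_data;
    int   v_type;
};

/*
 * Fragile: treats the structure pointer as a pointer to its first
 * member. This only works while v_data is at offset 0.
 */
static struct fs_node *node_fragile(struct vnode_like *vp)
{
    return *(void **)vp;
}

/* Robust: names the field, so the layout can change freely. */
static struct fs_node *node_correct(struct vnode_like *vp)
{
    return vp->v_data;
}
```

Both functions return the same pointer here, but only `node_correct()` keeps working if another field is ever placed before `v_data`.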
Davide Italiano
189e41259b Garbage collect a useless check. smp should never be NULL. 2013-06-28 20:14:30 +00:00
Davide Italiano
c7d2e4cf9b Plug a couple of leakages in smbfs_lookup(). 2013-06-28 20:07:24 +00:00
Pedro F. Giffuni
fafb835a0b Minor sorting.
MFC after:	3 days
2013-06-26 19:43:22 +00:00
Pedro F. Giffuni
da057ed2d3 Define and use e2fs_lbn_t in ext2fs.
In line with what is done in UFS, define an internal type
e2fs_lbn_t for the logical block numbers.

This change is basically a no-op as the new type is unchanged
(int32_t) but it may be useful as bumping this may be required
for ext4fs.

Also, as pointed out by Bruce Evans:

- Use daddr_t for daddr in ext2_bmaparray(). This seems to
  improve reliability with the reallocblks option.
- Add a cast to the fsbtodb() macro as in UFS.

Reviewed by:	bde
MFC after:	3 days
2013-06-23 02:44:42 +00:00
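
The two ideas in da057ed2d3 (a dedicated type for logical block numbers and a widening cast in the block-to-disk-address macro) can be sketched as below. Only `e2fs_lbn_t` and its `int32_t` definition come from the commit message; `daddr_like_t`, `FSBTODB_SHIFT` and `fsbtodb_sketch()` are assumptions made for illustration and are not the ext2fs definitions.

```c
#include <stdint.h>

/* Logical block number type; int32_t as stated in the commit message. */
typedef int32_t e2fs_lbn_t;

/* Hypothetical 64-bit stand-in for the kernel's daddr_t. */
typedef int64_t daddr_like_t;

/* Hypothetical shift: 8 device blocks per filesystem block. */
#define FSBTODB_SHIFT 3

/*
 * Widen to the disk address type before shifting, so the result
 * does not overflow the 32-bit block number type.
 */
#define fsbtodb_sketch(b) ((daddr_like_t)(b) << FSBTODB_SHIFT)
```

For example, `fsbtodb_sketch((e2fs_lbn_t)0x40000000)` yields `0x200000000`, a value that would not fit in 32 bits had the shift been done before widening.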
Rick Macklem
2e6a4b0c55 Fix r252074 so that it builds on 64bit arches. 2013-06-22 21:58:21 +00:00
Rick Macklem
1dd95a046c The NFSv4.1 LayoutCommit operation requires a valid offset and length
(0, 0 is not sufficient). This patch adds a loop over the file layouts, using
the offset and length of each file layout in a separate LayoutCommit.
2013-06-21 22:46:16 +00:00
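
The per-layout loop described in 1dd95a046c might look roughly like the sketch below. All names (`struct file_layout`, `commit_fn`, `layoutcommit_all()`, `sum_lengths()`) are hypothetical, and stopping at the first error is an assumption of this sketch, not something the commit message states.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one file layout's byte range. */
struct file_layout {
    uint64_t off;
    uint64_t len;
};

/* Hypothetical callback that issues one LayoutCommit; returns 0 on success. */
typedef int (*commit_fn)(uint64_t off, uint64_t len, void *arg);

/*
 * Issue a separate commit for each layout, passing that layout's own
 * offset and length rather than a single (0, 0) range.
 */
static int layoutcommit_all(const struct file_layout *fl, size_t n,
    commit_fn commit, void *arg)
{
    for (size_t i = 0; i < n; i++) {
        int error = commit(fl[i].off, fl[i].len, arg);
        if (error != 0)
            return error;   /* assumption: stop at the first failure */
    }
    return 0;
}

/* Example callback: just accumulates the committed lengths. */
static int sum_lengths(uint64_t off, uint64_t len, void *arg)
{
    (void)off;
    *(uint64_t *)arg += len;
    return 0;
}
```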
Rick Macklem
562395581b When the NFSv4.1 client is writing to a pNFS Data Server (DS), the
file's size attribute does not get updated. As such, it is necessary
to invalidate the attribute cache before clearing NMODIFIED for pNFS.

MFC after:	2 weeks
2013-06-21 22:26:18 +00:00
Rick Macklem
315c38d135 Since some NFSv4 servers enforce the requirement for a reserved port#,
enable use of the (no)resvport mount option for NFSv4. I had thought
that the RFC required that non-reserved port #s be allowed, but I couldn't
find it in the RFC.

MFC after:	2 weeks
2013-06-21 19:41:30 +00:00
Pedro F. Giffuni
3f5747b69d Rename some prefixes in the Block Group Descriptor fields to ext4bgd_
Change prefix to avoid confusion and denote that these fields
are generally only available starting with ext4.

MFC after:	3 days
2013-06-20 00:00:33 +00:00
Pedro F. Giffuni
9e43acf6c0 More ext2fs header cleanups:
- Set MAXMNTLEN nearer to where it is used.
- Move EXT2_LINK_MAX to ext2_dir.h .

MFC after:	3 days
2013-06-18 15:49:30 +00:00
Pedro F. Giffuni
ebf0f88839 Rename remaining DIAGNOSTIC to INVARIANTS.
MFC after:	3 days
2013-06-17 00:39:23 +00:00
Pedro F. Giffuni
b6113fb31a Re-sort ext2fs headers to make things easier to find.
In the ext2fs driver we have a mixture of headers:

- The ext2_ prefixed headers have strong influence from NetBSD
and carry specific ext2/3/4 information.
- The unprefixed headers are inspired by UFS and carry implementation
specific information.

Do some small adjustments so that the information is easier to
find coming from either UFS or the NetBSD implementation.

MFC after:	3 days
2013-06-16 16:10:45 +00:00
Pedro F. Giffuni
f744956b4a Relax some unnecessary unsigned type changes in ext2fs.
While the changes in r245820 are in line with the ext2 spec,
the code derived from UFS can use negative values so it is
better to relax some types to keep them as they were, and
somewhat more similar to UFS. While here clean some casts.

Some of the original types are still wrong and will require
more work.

Discussed with:	bde
MFC after:	3 days
2013-06-13 03:23:24 +00:00
Pedro F. Giffuni
77b193c249 Turn DIAGNOSTICs to INVARIANTS in ext2fs.
This is done to be consistent with what other filesystems, and
particularly ffs, already do (see r173464).

MFC after:	5 days
2013-06-12 15:24:48 +00:00
Pedro F. Giffuni
abe38ac774 s/file system/filesystem/g
Based on r96755 from UFS.

MFC after:	3 days
2013-06-11 02:47:07 +00:00
Pedro F. Giffuni
f7d4b4d3d1 e2fs_bpg and e2fs_isize are always unsigned.
The superblock in ext2fs defines all the fields as unsigned but for
some reason the in-memory superblock was carrying e2fs_bpg and
e2fs_isize as signed.

We should preserve the specified types for consistency.

MFC after:	5 days
2013-06-09 01:38:51 +00:00
Alan Cox
f50b6721e1 Add missing VM object unlocks in an error case.
Reviewed by:	kib
2013-06-07 19:42:00 +00:00
Alan Cox
27a18d6a23 Don't busy the page unless we are likely to release the object lock.
Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
2013-06-06 06:17:20 +00:00
Alan Cox
66c392df53 Relax the vm object locking. Use a read lock.
Sponsored by:	EMC / Isilon Storage Division
2013-06-05 17:00:10 +00:00
Alan Cox
ba887a9b33 Eliminate unnecessary vm object locking from tmpfs_nocacheread(). 2013-06-04 15:40:45 +00:00
Pedro F. Giffuni
532ebe1313 ext2fs: space vs tab.
Obtained from:	Christoph Mallon
MFC after:	3 days
2013-06-03 20:33:05 +00:00
Pedro F. Giffuni
fc3ea958b2 ext2fs: Small cosmetic fixes.
Make a long macro readable and sort a header.

Obtained from:	Christoph Mallon
MFC after:	3 days
2013-06-03 20:02:45 +00:00
Pedro F. Giffuni
4f69a09308 ext2fs: Update Block Group Descriptor struct.
Uncover some previously reserved fields that are used by Ext4.
These are currently unused but it is good to have them for future
reference.

Reviewed by:	bde
MFC after:	3 days
2013-06-03 18:52:14 +00:00
Jeff Roberson
22a722605d - Convert the bufobj lock to rwlock.
- Use a shared bufobj lock in getblk() and inmem().
- Convert softdep's lk to rwlock to match the bufobj lock.
- Move INFREECNT to b_flags and protect it with the buf lock.
- Remove unnecessary locking around bremfree() and BKGRDINPROG.

Sponsored by:	EMC / Isilon Storage Division
Discussed with:	mckusick, kib, mdf
2013-05-31 00:43:41 +00:00
Konstantin Belousov
67b4ed4b88 Assert that OBJ_TMPFS flag on the vm object for the tmpfs node is
cleared when the tmpfs node is going away.

Tested by:	bdrewery, pho
2013-05-30 19:51:33 +00:00
Rick Macklem
734b03c38d Post-r248567, there were times when the client would return a
truncated directory for some NFS servers. This turned out to
be because the size of a directory reported by an NFS server
can be smaller than the ufs-like directory created from the
RPC XDR in the client. This patch fixes the problem by changing
r248567 so that vnode_pager_setsize() is only done for regular files.

Reported and tested by:	hartmut.brandt@dlr.de
Reviewed by:	kib
MFC after:	1 week
2013-05-28 22:36:01 +00:00
Konstantin Belousov
74c7ff1a0e Do not leak the NULLV_NOUNLOCK flag from the nullfs_unlink_lowervp(),
for the case when the nullfs vnode is not reclaimed.  Otherwise, later
reclamation would not unlock the lower vnode.

Reported by:	antoine
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-05-21 11:31:56 +00:00
Dag-Erling Smørgrav
72ccd4cc6b Fix typo in comment.
Submitted by:	Alex Weber <alexwebr@gmail.com>
MFC after:	1 week
2013-05-15 08:38:49 +00:00
Rick Macklem
77a03c148c Add support for the eofflag to nfs_readdir() in the new NFS
client so that it works under a unionfs mount.

Submitted by:	Jared Yanovich (slovichon@gmail.com)
Reviewed by:	kib
MFC after:	2 weeks
2013-05-12 21:48:08 +00:00
Eitan Adler
a164074fc4 Fix several typos
PR:		kern/176054
Submitted by:	Christoph Mallon <christoph.mallon@gmx.de>
MFC after:	3 days
2013-05-12 16:43:26 +00:00
Jilles Tjoelker
d3045c081d fdescfs: Supply a real value for d_type in readdir.
All the fdescfs nodes (except . and ..) appear as character devices to
stat(), so DT_CHR is correct.
2013-05-12 15:44:49 +00:00
Konstantin Belousov
0fc6daa72d - Fix nullfs vnode reference leak in nullfs_reclaim_lowervp(). The
  null_hashget() obtains the reference on the nullfs vnode, which must
  be dropped.

- Fix a wart which existed from the introduction of the nullfs
  caching, do not unlock lower vnode in the nullfs_reclaim_lowervp().
  It should be innocent, but now it is also formally safe.  Inform the
  nullfs_reclaim() about this using the NULLV_NOUNLOCK flag set on
  nullfs inode.

- Add a callback to the upper filesystems for the lower vnode
  unlinking. When inactivating a nullfs vnode, check if the lower
  vnode was unlinked, indicated by nullfs flag NULLV_DROP or VV_NOSYNC
  on the lower vnode, and reclaim upper vnode if so.  This allows
  nullfs to purge cached vnodes for the unlinked lower vnode, avoiding
  excessive caching.

Reported by:	Göran Löwkrantz <goran.lowkrantz@ismobile.com>
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2013-05-11 11:17:44 +00:00
Konstantin Belousov
3fa456b35d Avoid deactivating the page if it is already on a queue; only requeue
the page.  This both reduces the amount of queue locking and avoids
moving the active page to the inactive list just because the page was read
or written.

Based on the suggestion by:	alc
Reviewed by: alc
Tested by:   pho
2013-05-06 21:04:42 +00:00
Davide Italiano
caa8e38fa6 Change VM_OBJECT_LOCK/UNLOCK() -> VM_OBJECT_WLOCK/WUNLOCK() to reflect
the recent switch of the vm object lock to a rwlock.

Reported by:	attilio
2013-05-04 14:27:28 +00:00
Davide Italiano
a4c059845a Overhaul locking in netsmb, getting rid of the obsolete lockmgr() primitive.
This solves a long standing LOR between smb_conn and smb_vc.

Tested by:	martymac, pho (previous version)
2013-05-04 14:18:10 +00:00
Davide Italiano
92a4d9bcc8 Completely rewrite the interface to smbdev, switching from dev_clone
to cdevpriv(9). This commit changes the semantics of mount_smbfs
in userland as well, which now passes a file descriptor in order
to mount a specific filesystem instance.

Reviewed by:	attilio, ed
Tested by:	martymac
2013-05-04 14:03:18 +00:00
Konstantin Belousov
293e4eb67d The fsync(2) call should sync the vnode in such a way that even after
a system crash that happens after a successful fsync() return, the data
is accessible.  For msdosfs, this means that FAT entries for the file
must be written.

Since we do not track the FAT blocks containing entries for the
current file, just do a sloppy sync of the devvp vnode for the mount,
whose buffers, among other things, contain FAT blocks.

Simultaneously, for deupdat():
- optimize by clearing the modified flags before short-circuiting a
  return, if the mount is read-only;
- only ignore the rest of the function for denode with DE_MODIFIED
  flag clear when the waitfor argument is false.  The directory buffer
  for the entry might be of delayed write;
- microoptimize by comparing the updated directory entry with the
  current block content;
- try to cluster the write, fall back to bawrite() if low on
  resources.

Based on the submission by:	bde
MFC after:	2 weeks
2013-05-02 20:00:11 +00:00
Konstantin Belousov
df6b240b6f Fix the v_object leak for non-regular tmpfs vnodes.
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-05-02 18:46:31 +00:00
Konstantin Belousov
158cc900bb For the new regular tmpfs vnode, v_object is initialized before
insmntque() is called.  The standard insmntque destructor resets the
vop vector to the deadfs one, and calls vgone() on the vnode.  As a result,
v_object is kept unchanged, which triggers an assertion in the reclaim
code on insmntque() failure.  Also, in this case, the OBJ_TMPFS flag on
the backing vm object is not cleared.

Provide the tmpfs insmntque() destructor which properly clears
OBJ_TMPFS flag and resets v_object.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-05-02 18:44:31 +00:00
Konstantin Belousov
bdefcb6959 The page read or written could be wired. Do not requeue if the page
is not on a queue.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
2013-05-02 18:36:52 +00:00
Dag-Erling Smørgrav
c93c82f464 Fix a bug that allows NFS clients to issue READDIR on files.
PR:		kern/178016
Security:	CVE-2013-3266
Security:	FreeBSD-SA-13:05.nfsserver
2013-04-29 20:09:44 +00:00
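
The kind of check implied by c93c82f464 is a type test before the directory read proceeds. The sketch below is a user-space illustration with hypothetical names (`enum vtype_like`, `readdir_allowed()`); the actual server code and the error code it returns on the wire are not shown in the commit message, so the use of `ENOTDIR` here is an assumption.

```c
#include <errno.h>

/* Hypothetical subset of vnode types. */
enum vtype_like {
    VT_LIKE_REG,    /* regular file */
    VT_LIKE_DIR     /* directory */
};

/*
 * Reject READDIR on anything that is not a directory.
 * Returns 0 if the operation may proceed, ENOTDIR otherwise.
 */
static int readdir_allowed(enum vtype_like type)
{
    if (type != VT_LIKE_DIR)
        return ENOTDIR;
    return 0;
}
```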
Konstantin Belousov
6f2af3fcf3 Rework the handling of the tmpfs node backing swap object and tmpfs
vnode v_object to avoid double-buffering.  Use the same object both as
the backing store for tmpfs node and as the v_object.

Besides reducing memory use by up to 2x when mapping files from
tmpfs, it also makes tmpfs read and write operations copy half as
many bytes.

VM subsystem was already slightly adapted to tolerate OBJT_SWAP object
as v_object. Now the vm_object_deallocate() is modified to not
reinstantiate OBJ_ONEMAPPING flag and help the VFS to correctly handle
VV_TEXT flag on the last dereference of the tmpfs backing object.

Reviewed by:	alc
Tested by:	pho, bf
MFC after:	1 month
2013-04-28 19:38:59 +00:00
Rick Macklem
64a0e848ab When an NFS unmount occurs, once vflush() writes the last dirty
buffer for the last vnode on the mount back to the server, it
returns. At that point, the code continues with the unmount,
including freeing up the nfs specific part of the mount structure.
It is possible that an nfsiod thread will try to check for an
empty I/O queue in the nfs specific part of the mount structure
after it has been freed by the unmount. This patch avoids this problem by
setting the iodmount entries for the mount back to NULL while holding the
mutex in the unmount and checking the appropriate entry is non-NULL after
acquiring the mutex in the nfsiod thread.

Reported and tested by:	pho
Reviewed by:	kib
MFC after:	2 weeks
2013-04-18 23:20:16 +00:00
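
The handshake described in 64a0e848ab (clear the per-thread mount pointers under the mutex during unmount, and re-check for NULL under the same mutex in the I/O thread) can be sketched in user space. Apart from the `iodmount` name taken from the commit message, everything here (`struct mount_like`, `iod_mtx`, `NIOD`, both functions) is a hypothetical stand-in.

```c
#include <pthread.h>
#include <stddef.h>

#define NIOD 4   /* hypothetical number of I/O daemon slots */

/* Hypothetical stand-in for the NFS-specific mount structure. */
struct mount_like {
    int queue_len;   /* pending I/O requests */
};

static pthread_mutex_t iod_mtx = PTHREAD_MUTEX_INITIALIZER;
static struct mount_like *iodmount[NIOD];

/* Unmount side: detach the mount from every slot before it is freed. */
static void unmount_detach(struct mount_like *mp)
{
    pthread_mutex_lock(&iod_mtx);
    for (int i = 0; i < NIOD; i++) {
        if (iodmount[i] == mp)
            iodmount[i] = NULL;
    }
    pthread_mutex_unlock(&iod_mtx);
    /* Only now may the caller free mp. */
}

/*
 * I/O thread side: returns 1 if there is nothing to do for this slot
 * (queue empty or mount already gone), 0 if work remains.
 */
static int iod_queue_idle(int slot)
{
    int idle = 1;

    pthread_mutex_lock(&iod_mtx);
    if (iodmount[slot] != NULL)   /* never dereference a detached mount */
        idle = (iodmount[slot]->queue_len == 0);
    pthread_mutex_unlock(&iod_mtx);
    return idle;
}
```

Because both sides take the same mutex, the I/O thread either sees a valid pointer while the mount is still alive or sees NULL, and never a pointer to freed memory.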
Rick Macklem
175b3f31d3 Both NFS clients can deadlock when using the "rdirplus" mount
option. This can occur when an nfsiod thread that already holds
a buffer lock attempts to acquire a vnode lock on an entry in
the directory (a LOR) when another thread holding the vnode lock
is waiting on an nfsiod thread. This patch avoids the deadlock by disabling
readahead for this case, so the nfsiod threads never do readdirplus.
Since readaheads for directories need the directory offset cookie
from the previous read, they cannot normally happen in parallel.
As such, testing by jhb@ and myself didn't find any performance
degradation when this patch is applied. If there is a case where
this results in a significant performance degradation, mounting
without the "rdirplus" option can be done to re-enable readahead
for directories.

Reported and tested by:	jhb
Reviewed by:	jhb
MFC after:	2 weeks
2013-04-18 13:09:04 +00:00
Kenneth D. Merry
adb974068b Move the NFS FHA (File Handle Affinity) code from sys/nfsserver to
sys/nfs, since it is now shared by the two NFS servers.

Suggested by:	rmacklem
Sponsored by:	Spectra Logic
MFC after:	2 weeks
2013-04-17 22:42:43 +00:00