Commit Graph

3516 Commits

Author SHA1 Message Date
Konstantin Belousov
538ee0d74e Remove mistakenly merged field.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 20:03:26 +00:00
Konstantin Belousov
00ac6a98d8 Add mount option for tmpfs(5) to not use namecache.
The option "nonc" disables use of the namecache for the created mount;
by default the namecache is used.  The rationale for the option is that
the namecache duplicates information which is already kept in memory
by tmpfs.  Since it is believed that the namecache scales better than
tmpfs, or will scale better, do not enable the option by default.  On
the other hand, smaller machines may benefit from lower namecache
pressure.

Discussed with:	mjg
Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:46:49 +00:00
Konstantin Belousov
08c053e71c Implement VOP_VPTOCNP() for tmpfs.
For directories, the node->tn_spec.tn_dir.tn_parent pointer to the
parent is used.  For non-directories, the implementation is naive: all
directory nodes are scanned to find a dirent linking the specified
node.  This can be significantly improved later by maintaining
tn_parent for all nodes.

Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:29:13 +00:00
Konstantin Belousov
b4ba3b6459 VNON nodes cannot exist.
Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:25:42 +00:00
Konstantin Belousov
64c250439f Refcount tmpfs nodes and mount structures.
On dotdot lookup and fhtovp operations, it is possible for the file
represented by tmpfs node to be removed after the thread calculated
the pointer.  In this case, tmpfs_alloc_vp() accesses freed memory.

Introduce the reference count on the nodes.  The allnodes list from
tmpfs mount owns 1 reference, and threads performing unlocked
operations on the node add one transient reference.  Similarly, since
struct tmpfs_mount maintains the list where nodes are enlisted,
refcount it by one reference from struct mount and one reference from
each node on the list.  Both nodes and tmpfs_mounts are removed when
refcount goes to zero.

Note that this means that nodes and tmpfs_mounts might survive some
time after the node is deleted or tmpfs_unmount() has finished.  In
these cases tmpfs_alloc_vp() returns an error, either due to node
removal (tn_nlinks == 0) or because of an insmntque1(9) error.

Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:15:21 +00:00
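
The reference counting scheme described in 64c250439f can be sketched in
userland C as follows.  This is only an illustration under stated
assumptions, not the tmpfs code: struct node, node_alloc(), node_ref(),
node_unref() and node_use() are hypothetical names, and C11 atomics stand in
for the kernel primitives.  The owning list holds one reference, an unlocked
user takes a transient one, and the node is freed only when the count reaches
zero, so a user that raced with removal sees nlinks == 0 instead of freed
memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tmpfs node; not the kernel structure. */
struct node {
	atomic_uint	 refcount;	/* 1 for the owning list + transient refs */
	unsigned	 nlinks;	/* 0 once the file has been removed */
	bool		*freed;		/* demo only: set when memory is released */
};

/* Allocate a node holding the single reference owned by the list. */
static struct node *
node_alloc(bool *freed)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);
	atomic_init(&n->refcount, 1);
	n->nlinks = 1;
	n->freed = freed;
	*freed = false;
	return (n);
}

/* Take a transient reference before using the node unlocked. */
static void
node_ref(struct node *n)
{
	atomic_fetch_add(&n->refcount, 1);
}

/* Drop one reference; free the node when the last one goes away. */
static bool
node_unref(struct node *n)
{
	if (atomic_fetch_sub(&n->refcount, 1) == 1) {
		*n->freed = true;
		free(n);
		return (true);
	}
	return (false);
}

/* A user holding a reference must still check that the node is linked. */
static int
node_use(const struct node *n)
{
	return (n->nlinks == 0 ? -1 : 0);
}
```

In this sketch a removed node stays allocated until the last transient
holder calls node_unref(), which mirrors the note above that nodes may
survive for some time after deletion.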
Konstantin Belousov
1c07d69bc2 Make tmpfs directory cursor available outside tmpfs_subr.c.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 18:38:58 +00:00
Konstantin Belousov
280ffa5ed7 Rename tmpfs_mount member allnode_lock to include namespace prefix.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 16:01:36 +00:00
Konstantin Belousov
4960d0d453 Protect macro argument.
Requested by:	hselasky
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 15:06:18 +00:00
Konstantin Belousov
e7e6c82067 Rework some tmpfs lock assertions.
Remove TMPFS_ASSERT_ELOCKED().  Its claims are already stated by other
asserts nearby and by VFS guarantees.
Change TMPFS_ASSERT_LOCKED() and one inlined place to use
ASSERT_VOP_(E)LOCKED() instead of hand-rolled imprecise asserts.

Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 14:49:55 +00:00
Konstantin Belousov
bba7ed2054 Style fixes and comment updates.
Edit comments which explain no longer relevant details, and add
locking annotations to the struct tmpfs_node members.

Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 14:27:37 +00:00
Konstantin Belousov
9e3ff5c594 Remove unused union member, fifos on tmpfs are implemented in common code.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 13:35:14 +00:00
Mateusz Guzik
ed2159c92c tmpfs: manage tm_pages_used with atomics
Reviewed by:	kib (previous version)
2017-01-14 06:20:36 +00:00
Mateusz Guzik
bedd46202f cd9660: fix up compilation on sparc after r311665
Reported by:	linimon
2017-01-10 04:17:53 +00:00
Conrad Meyer
f7da1444fd cd9660: typedef cd_ino_t in preference to #define
Suggested by:	kib@
2017-01-09 23:56:45 +00:00
Conrad Meyer
d5fadb019e cd9660: Add a prototype for cd9660_vfs_hash_cmp
GCC warns (and errors, with -Werror) about it otherwise.  Clang doesn't care.

Introduced in r311665.

Reported by:	np@
2017-01-09 23:51:31 +00:00
Konstantin Belousov
d72b4d3918 Forcibly remove the cached items from pseudofs vncache on module unload.
If some process' nodes were accessed using procfs and the process
cannot exit properly at the time the modunload event is reported to the
pseudofs-backed filesystem, the assertion in pfs_vncache_unload() is
triggered.  The assertion is correct: the cache should be cleaned.

Approved by:	des (pseudofs maintainer)
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-09 20:14:18 +00:00
Conrad Meyer
55cfad428b iso_rrip.h: Hide kernel definitions from makefs(8)
Reported by:	O. Hartmann <ohartmann at walstatt.org>
2017-01-08 09:16:07 +00:00
Conrad Meyer
c8227de25f Do not truncate inode calculation from ISO9660 block offset
PR:		190655
Reported by:	Thomas Schmitt <scdbackup at gmx.net>
Obtained from:	NetBSD sys/fs/cd9660/cd9660_node.c,r1.31
2017-01-08 06:22:35 +00:00
Conrad Meyer
dbaab6e66f cd9660: Expand internal inum size to 64 bits
Inums in cd9660 refer to byte offsets on the media.  DVD and BD media
can have entries above 4GB, especially with multi-session images.

PR:		190655
Reported by:	Thomas Schmitt <scdbackup at gmx.net>
2017-01-08 06:21:49 +00:00
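
The truncation that the two cd9660 commits above (c8227de25f, dbaab6e66f)
guard against can be illustrated with a small sketch.  It assumes, purely
for illustration, that a byte offset is derived from a 32-bit block number
and a block shift; offset_truncated() and offset_wide() are hypothetical
names and are not taken from the cd9660 sources.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Shift performed in 32 bits: bits above 4GB are lost before the
 * result is widened to 64 bits.
 */
static uint64_t
offset_truncated(uint32_t block, unsigned bshift)
{
	uint32_t narrow;

	assert(bshift < 32);
	narrow = block << bshift;
	return (narrow);
}

/* Widen first, then shift: the full byte offset is preserved. */
static uint64_t
offset_wide(uint32_t block, unsigned bshift)
{
	assert(bshift < 32);
	return ((uint64_t)block << bshift);
}
```

With 2048-byte blocks (a shift of 11), block 0x300000 lies at byte offset
0x180000000, which is above 4GB; the 32-bit computation yields 0x80000000
instead.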
Mateusz Guzik
3b622fc857 tmpfs: perform a lockless check in tmpfs_itimes
Most of the time the status is 0 as the function is repeatedly
called from tmpfs_getattr.
2017-01-06 19:58:20 +00:00
Mateusz Guzik
31e73fd434 tmpfs: enable MNTK_EXTENDED_SHARED
Discussed with:	kib
2017-01-06 18:01:46 +00:00
Konstantin Belousov
5dc1128656 Lock tmpfs node tn_status updates done under the shared vnode lock.
If tmpfs vnode is only shared locked, tn_status field still needs
updates to note the access time modification.  Use the same locking
scheme as for UFS, protect tn_status with the node interlock + shared
vnode lock.

Fix nearby style.

Noted and reviewed by:	mjg
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-06 17:43:36 +00:00
Konstantin Belousov
305b422966 Use vnode lock assertion expression, and upgrade it to assert the
required exclusive state of the vnode lock in tmpfs chflags, chmod,
chown, chsize, chtimes operations.

Fix nearby style.

Reviewed by:	mjg
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-06 17:32:44 +00:00
Konstantin Belousov
9a4d5dbbac Remove dead code.
Fifos overwrite file ops vector, and fifo VOP_KQFILTER is VOP_PANIC().

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-06 17:03:08 +00:00
Konstantin Belousov
1c32456953 Use type-independent formats for printing nlink_t and ino_t.
Extracted from:	ino64 work by gleb, mckusick
Discussed with:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-06 16:59:33 +00:00
Konstantin Belousov
2f304845e2 Do not allocate struct statfs on kernel stack.
Right now the size of the structure is 472 bytes on amd64, which is
already large, and stack allocations are undesirable.  With the ino64
work, MNAMELEN is increased to 1024, which will make it impossible to have
struct statfs on the stack.

Extracted from:	ino64 work by gleb
Discussed with:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-05 17:19:26 +00:00
Josh Paetzel
b5a8f340f1 Work around an NFS bug with readdirplus when there are more than 1 billion files in a filesystem.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	iXsystems
Differential Revision:	D9009
2017-01-02 19:18:56 +00:00
Pedro F. Giffuni
a86fdf887e Undo small wrong style change.
Reported by:	kib
2016-12-28 16:16:36 +00:00
Pedro F. Giffuni
bf9a211dff style(9) cleanups.
Just to reduce some of the issues found with indent(1).

MFC after:	1 week
2016-12-28 15:43:17 +00:00
Rick Macklem
b2fc0141d9 Fix NFSv4.1 client recovery from NFS4ERR_BAD_SESSION errors.
For most NFSv4.1 servers, a NFS4ERR_BAD_SESSION error is a rare failure
that indicates that the server has lost session/open/lock state.
However, recent testing by cperciva@ against the AmazonEFS server found
several problems with client recovery from this due to it generating this
failure frequently.
Briefly, the problems fixed are:
- If all session slots were in use at the time of the failure, some processes
  would continue to loop waiting for a slot on the old session forever.
- If an RPC that doesn't use open/lock state failed with NFS4ERR_BAD_SESSION,
  it would fail the RPC/syscall instead of initiating recovery and then
  looping to retry the RPC.
- If a successful reply to an RPC for an old session wasn't processed
  until after a new session was created for a NFS4ERR_BAD_SESSION error,
  it would erroneously update the new session and corrupt it.
- The use of the first element of the session list in the nfs mount
  structure (which is always the current metadata session) was slightly
  racy. With the changes for the above problems it became more racy, so all
  uses of this head pointer were wrapped with NFSLOCKMNT()/NFSUNLOCKMNT().
- Although the kernel malloc() usually allocates more bytes than requested
  and, as such, this wouldn't have caused problems, the allocation of a
  session structure was 1 byte smaller than it should have been.
  (Null termination byte for the string not included in byte count.)

There are probably still problems with a pNFS data server that fails
with NFS4ERR_BAD_SESSION, but I have no server that does this to test
against (the AmazonEFS server doesn't do pNFS), so I can't fix these yet.

Although this patch is fairly large, it should only affect the handling
of NFS4ERR_BAD_SESSION error replies from an NFSv4.1 server.
Thanks go to cperciva@ for the extensive testing he did to help isolate/fix
these problems.

Reported by:	cperciva
Tested by:	cperciva
MFC after:	3 months
Differential Revision:	https://reviews.freebsd.org/D8745
2016-12-23 23:14:53 +00:00
Alan Cox
2d612d2dd2 When tmpfs and POSIX shm page in a page for the sole purpose of performing
truncation, immediately queue the page for asynchronous laundering rather
than making the page pass through the inactive queue first.

Reviewed by:	kib, markj
2016-12-11 19:24:41 +00:00
Rick Macklem
a5d19b81b4 Fix the NFSv4.1 server for Open reclaim after a reboot.
The NFSv4.1 server failed to update the nfs-stablerestart file for
a client when the client was issued its first Open. As such, recovery
of Opens after a server reboot failed with NFSERR_NOGRACE.
This patch fixes this.
It also changes the code so that it malloc()'s the 1024 byte array
instead of allocating it on the kernel stack for both NFSv4.0 and NFSv4.1.
Note that this bug only affected NFSv4.1 and only when clients attempted
to reclaim Opens after a server reboot.

MFC after:	2 weeks
2016-12-05 22:36:25 +00:00
Pedro F. Giffuni
fca15474a0 ext2fs: renumber the license clauses to avoid skipping #3.
This is to keep consistency with other files, and help license-checking
utilities determine the number of clauses that apply.

No functional change.
2016-12-02 19:47:23 +00:00
Konstantin Belousov
abc1515601 The NFSv4 client tracks opens, and the tracking records are only dropped
when the vnode is inactivated.  This conflicts with nullfs caching,
which keeps the upper vnode around and, as a consequence, keeps the use
reference to the lower vnode.

Add a filesystem flag to request nullfs to not cache when mounted over
that filesystem, and set the flag for nfs v4 mounts.

Reported by:	asomers
Reviewed by:	rmacklem
Tested by:	asomers, rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-11-27 09:20:58 +00:00
Pedro F. Giffuni
bb9535bbc7 ext2: avoid possible overflow when calculating malloc size.
This is inspired by r308064 for the case of reloading UFS.

MFC after:	1 week
2016-11-26 02:06:33 +00:00
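
The kind of overflow that bb9535bbc7 refers to arises when an element count
is multiplied by an element size before allocation.  The sketch below shows
one common way to guard such a computation in userland C.  It is an
assumption about the general technique, not the code from that commit, and
alloc_array_checked() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate count * elem_size bytes, refusing requests whose size
 * computation would wrap around SIZE_MAX.
 */
static void *
alloc_array_checked(size_t count, size_t elem_size)
{
	assert(elem_size != 0);
	if (count > SIZE_MAX / elem_size)
		return (NULL);	/* the product would overflow */
	return (malloc(count * elem_size));
}
```

Without the check, a large count could wrap the product to a small value,
and the caller would then write past the end of an undersized buffer.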
Rick Macklem
1a2079d936 Stop "nfsstat -z" from clearing counts of NFSv4 state structures.
The "-z" option of nfsstat was erroneously zeroing out the counts
of NFSv4 state structures. These counts will normally go back down
to zero as state is released. When zeroed out by "-z", these counts
can go negative. This patch fixes this problem.

MFC after:	2 weeks
2016-11-25 23:28:09 +00:00
Mark Johnston
99e6e1930c Release laundered vnode pages to the head of the inactive queue.
The swap pager enqueues laundered pages near the head of the inactive queue
to avoid another trip through LRU before reclamation. This change adds
support for this behaviour to the vnode pager and makes use of it in UFS and
ext2fs. Some ioflag handling is consolidated into a common subroutine so
that this support can be easily extended to other filesystems which make use
of the buffer cache. No changes are needed for ZFS since its putpages
routine always undirties the pages before returning, and the laundry
thread requeues the pages appropriately in this case.

Reviewed by:	alc, kib
Differential Revision:	https://reviews.freebsd.org/D8589
2016-11-23 17:53:07 +00:00
Alan Cox
bba39b9ae3 Remove PG_CACHED-related fields from struct vmmeter, because they are no
longer used.  More precisely, they are always zero because the code that
decremented and incremented them no longer exists.

Bump __FreeBSD_version to mark this change.

Reviewed by:	kib, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8583
2016-11-22 18:13:46 +00:00
Konstantin Belousov
1fa81dab7d On error, bread(9) zeroes buffer pointer, do not dereference it.
See r294954 for the bread(9) change and r297401 for similar cd9660 fix.

Reported and tested by:	Joshua Kinard <kumba@gentoo.org>
PR:	214705
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-11-22 13:24:57 +00:00
Konstantin Belousov
753a007f0d Use buffer pager for NFS.
The pager, due to its construction, implements clustering for the
page-ins.  In particular, buildworld load demonstrates reduction of
the READ RPCs from 39k down to 24k.  No change in real or CPU time was
observed.

Discussed with, and measured by:	bde
No objections from:	rmacklem
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-11-22 10:58:24 +00:00
Konstantin Belousov
fc2c3afee0 Minor cleanup, remove unneeded XXX comments and unused re-define.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-11-22 10:24:59 +00:00
Colin Percival
63659ba6df Reduce NFS "NFSv4( mounted on)? fileid > 32bits" log spam.
Rather than printing a warning for every time we receive a fileid > 2^32
from the NFS server, count warnings and print at most one of each warning
type per minute, e.g.,

Nov 15 05:17:34 ip-172-30-1-221 kernel: NFSv4 fileid > 32bits (24730 occurrences)
Nov 15 05:17:56 ip-172-30-1-221 kernel: NFSv4 mounted on fileid > 32bits (178 occurrences)
Nov 15 05:18:53 ip-172-30-1-221 kernel: NFSv4 fileid > 32bits (7582 occurrences)
Nov 15 05:18:58 ip-172-30-1-221 kernel: NFSv4 mounted on fileid > 32bits (23 occurrences)

A buildworld with an NFS mounted /usr/obj can otherwise result in
hundreds of thousands of lines being printed, which seems unnecessarily
verbose.

When ino_t becomes a 64-bit type, these printfs will no longer be needed
(and the problems associated with truncating 64-bit fileids to generate
32-bit inode numbers will also go away).

Reviewed by:	rmacklem
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D8523
2016-11-16 01:11:49 +00:00
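
The counting approach described in 63659ba6df can be sketched as below.
This is a userland illustration under stated assumptions, not the NFS client
code: struct ratelimited_warn and warn_ratelimited() are hypothetical names,
the caller supplies the current time, and the first occurrence is printed
immediately, which may differ from the committed behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical per-warning state; one instance per warning type. */
struct ratelimited_warn {
	const char	*msg;		/* text of the warning */
	unsigned long	 pending;	/* occurrences since the last print */
	time_t		 last_print;	/* when the warning was last printed */
	bool		 printed;	/* false until the first print */
};

/*
 * Count one occurrence and print at most once per interval.
 * Returns the number of occurrences reported, or 0 if suppressed.
 */
static unsigned long
warn_ratelimited(struct ratelimited_warn *w, time_t now, time_t interval)
{
	unsigned long n;

	assert(interval > 0);
	w->pending++;
	if (w->printed && now - w->last_print < interval)
		return (0);
	n = w->pending;
	printf("%s (%lu occurrences)\n", w->msg, n);
	w->pending = 0;
	w->last_print = now;
	w->printed = true;
	return (n);
}
```

Keeping a separate structure for each warning type gives the "at most one of
each warning type per minute" behaviour when the interval is 60 seconds.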
Alan Cox
7667839a7e Remove most of the code for implementing PG_CACHED pages. (This change does
not remove user-space visible fields from vm_cnt or all of the references to
cached pages from comments.  Those changes will come later.)

Reviewed by:	kib, markj
Tested by:	pho
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8497
2016-11-15 18:22:50 +00:00
Edward Tomasz Napierala
53232c0d1d Remove spurious space.
MFC after:	1 month
2016-11-13 12:06:25 +00:00
Bryan Drewery
28323add09 Fix improper use of "its".
Sponsored by:	Dell EMC Isilon
2016-11-08 23:59:41 +00:00
Edward Tomasz Napierala
a79e9d0fec Value returned by taskqueue_enqueue_timeout(9) is not an error; don't treat
it as such.

MFC after:	1 month
2016-11-05 12:30:10 +00:00
Konstantin Belousov
7359fdcf5f Allow some dotdot lookups in capability mode.
If dotdot lookup does not escape from the file descriptor passed as
the lookup root, we can allow the component traversal.  Track the
directories traversed, and check the result of dotdot lookup against
the recorded list of the directory vnodes.

Dotdot lookups are enabled by sysctl vfs.lookup_cap_dotdot, currently
disabled by default until more verification of the approach is done.

Disallow non-local filesystems for dotdot, since remote server might
conspire with the local process to allow it to escape the namespace.
This might be too cautious, provide the knob
vfs.lookup_cap_dotdot_nonlocal to override as well.

Idea by:	rwatson
Discussed with:	emaste, jonathan, rwatson
Reviewed by:	mjg (previous version)
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D8110
2016-11-02 12:43:15 +00:00
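
The tracking idea in 7359fdcf5f (record the directories traversed, then
accept a dotdot result only if it is on the recorded list) can be sketched
as follows.  This is a simplified illustration, not the VFS lookup code:
integer ids stand in for directory vnodes, and struct cap_tracker,
tracker_record() and tracker_dotdot_allowed() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	TRACK_MAX	32	/* arbitrary limit for this sketch */

/* Directories visited so far, starting with the lookup root. */
struct cap_tracker {
	int	dirs[TRACK_MAX];
	size_t	count;
};

/* Remember a directory that the lookup has descended into. */
static bool
tracker_record(struct cap_tracker *t, int dir)
{
	assert(t != NULL);
	if (t->count == TRACK_MAX)
		return (false);
	t->dirs[t->count++] = dir;
	return (true);
}

/* A ".." step is allowed only if it lands on a recorded directory. */
static bool
tracker_dotdot_allowed(const struct cap_tracker *t, int result_dir)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->count; i++) {
		if (t->dirs[i] == result_dir)
			return (true);
	}
	return (false);
}
```

A dotdot step from the lookup root itself resolves to a directory that was
never recorded, so the check rejects it and the lookup cannot escape the
root descriptor.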
Konstantin Belousov
c329ee711b Use buffer pager for cd9660.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-10-28 11:46:39 +00:00
Konstantin Belousov
06965e96b3 Use buffer pager for msdosfs.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-10-28 11:46:15 +00:00
Konstantin Belousov
2aa3944510 Enable vn_io_fault() deadlock avoidance for msdosfs.
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-10-28 11:35:06 +00:00