53 Commits

Author SHA1 Message Date
Konstantin Belousov
00ac6a98d8 Add mount option for tmpfs(5) to not use namecache.
The option "nonc" disables using of namecache for the created mount,
by default namecache is used.  The rationale for the option is that
namecache duplicates the information which is already kept in memory
by tmpfs.  Since it believed that namecache scales better than tmpfs,
or will scale better, do not enable the option by default.  On the
other hand, smaller machines may benefit from lesser namecache
pressure.

Discussed with:	mjg
Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:46:49 +00:00
Konstantin Belousov
64c250439f Refcount tmpfs nodes and mount structures.
On dotdot lookup and fhtovp operations, it is possible for the file
represented by tmpfs node to be removed after the thread calculated
the pointer.  In this case, tmpfs_alloc_vp() accesses freed memory.

Introduce the reference count on the nodes.  The allnodes list from
tmpfs mount owns 1 reference, and threads performing unlocked
operations on the node, add one transient reference.  Similarly, since
struct tmpfs_mount maintains the list where nodes are enlisted,
refcount it by one reference from struct mount and one reference from
each node on the list.  Both nodes and tmpfs_mounts are removed when
refcount goes to zero.

Note that this means that nodes and tmpfs_mounts might survive some
time after the node is deleted or tmpfs_unmount() finished.  The
tmpfs_alloc_vp() in these cases returns error either due to node
removal (tn_nlinks == 0) or because of insmntque1(9) error.

Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:15:21 +00:00
Konstantin Belousov
280ffa5ed7 Rename tmpfs_mount member allnode_lock to include namespace prefix.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 16:01:36 +00:00
Konstantin Belousov
bba7ed2054 Style fixes and comment updates.
Edit comments which explain no longer relevant details, and add
locking annotations to the struct tmpfs_node members.

Tested by:	pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 14:27:37 +00:00
Mateusz Guzik
ed2159c92c tmpfs: manage tm_pages_used with atomics
Reviewed by:	kib (previous version)
2017-01-14 06:20:36 +00:00
Mateusz Guzik
31e73fd434 tmpfs: enabled MNTK_EXTENDED_SHARED
Discussed with:	kib
2017-01-06 18:01:46 +00:00
Mark Johnston
5f34e93c58 Check suspendability on the mountpoint returned by VOP_GETWRITEMOUNT.
This obviates the need for a MNTK_SUSPENDABLE flag, since passthrough
filesystems like nullfs and unionfs no longer need to inherit this
information from their lower layer(s). This change also restores the
pre-r273336 behaviour of using the presence of a susp_clean VFS method to
request suspension support.

Reviewed by:	kib, mjg
Differential Revision:	https://reviews.freebsd.org/D2937
2015-07-05 22:37:33 +00:00
Konstantin Belousov
f40cb1c645 Update mtime for tmpfs files modified through memory mapping. Similar
to UFS, perform updates during syncer scans, which in particular means
that tmpfs now performs scan on sync.  Also, this means that a mtime
update may be delayed up to 30 seconds after the write.

The vm_object' OBJ_TMPFS_DIRTY flag for tmpfs swap object is similar
to the OBJ_MIGHTBEDIRTY flag for the vnode object, it indicates that
object could have been dirtied.  Adapt fast page fault handler and
vm_object_set_writeable_dirty() to handle OBJ_TMPFS_NODE same as
OBJT_VNODE.

Reported by:	Ronald Klop <ronald-lists@klop.ws>
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2015-01-28 10:37:23 +00:00
Konstantin Belousov
3544b0f68f tmpfs does not use UVM on FreeBSD.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2015-01-28 10:25:35 +00:00
Mateusz Guzik
12e2a30ef9 tmpfs: allow shared file lookups
Tested by: pho
2014-10-21 21:27:13 +00:00
Mateusz Guzik
4fce16e4c9 Provide vfs suspension support only for filesystems which need it, take
two.

nullfs and unionfs need to request suspension if underlying filesystem(s)
use it. Utilize mnt_kern_flag for this purpose.

This is a fixup for 273271.

No strong objections from: kib
Pointy hat to: mjg
MFC after:	2 weeks
2014-10-20 18:00:50 +00:00
Mateusz Guzik
020b8f17a0 Provide vfs suspension support only for filesystems which need it.
Need is expressed by providing vfs_susp_clean function in vfsops.

Differential Revision:	D952
Reviewed by:	kib (previous version)
MFC after:	2 weeks
2014-10-19 06:59:33 +00:00
Konstantin Belousov
4cda7f7ece Rework the tmpfs unmount.
- Suspend filesystem for unmount.  This prevents new tmpfs nodes from
  instantiating, and also ensures that only unmount thread can destroy
  nodes.

- Do not start tmpfs node deletion until all vnodes are reclaimed,
  which guarantees that no thread can access tmpfs data.  For this,
  call vflush() in the loop, until the mnt_nvnodelistsize is non-zero.
  Note that after mnt_nvnodelistsize becomes 0, insmntque() blocks
  insertion of a vnode germ into the mount list of vnodes.

- Fail node allocation when the filesystem is being unmounted.  This
  is race-free due to the vflush() call in loop.  This is mostly
  cosmetic, avoiding some more work which might be done until
  suspension in unmount is started.

Note that there is currently no way to prevent new vnode instantiation
from readers during the unmount.  Due to this, forced unmount might
live-lock if vflush() loop cannot get to the zero vnode count due to
races with readers.  The unmount would proceed after the load is
lifted.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 09:52:33 +00:00
Konstantin Belousov
fca015d301 Remove code separator lines which do not conform to style(9).
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-14 08:17:11 +00:00
Bryan Drewery
0742ebc98f Fix -o size less than PAGE_SIZE resulting in SIZE_MAX being used.
Discussed with:	kib
MFC after:	2 weeks
2014-03-14 01:43:55 +00:00
Xin LI
2454886e05 Allow tmpfs be mounted inside jail. 2013-08-23 22:52:20 +00:00
Nathan Whitehorn
59169d9156 tmpfs works perfectly fine with -o union -- there is no reason to exclude it
from the list of options.
2013-07-23 14:48:37 +00:00
Gleb Kurtsou
4fd5efe79e tmpfs: Replace directory entry linked list with RB-Tree.
Use file name hash as a tree key, handle duplicate keys.  Both VOP_LOOKUP
and VOP_READDIR operations utilize same tree for search.  Directory
entry offset (cookie) is either file name hash or incremental id in case
of hash collisions (duplicate-cookies).  Keep sorted per directory list
of duplicate-cookie entries to facilitate cookie number allocation.

Don't fail if previous VOP_READDIR() offset is no longer valid, start
with next dirent instead.  Other file system handle it similarly.

Workaround race prone tn_readdir_last[pn] fields update.

Add tmpfs_dir_destroy() to free all dirents.

Set NFS cookies in tmpfs_dir_getdents(). Return EJUSTRETURN from
tmpfs_dir_getdents() instead of hard coded -1.

Mark directory traversal routines static as they are no longer
used outside of tmpfs_subr.c
2013-01-06 22:15:44 +00:00
Attilio Rao
bc2258da88 Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag.
Porters should refer to __FreeBSD_version 1000021 for this change as
it may have happened at the same timeframe.
2012-11-09 18:02:25 +00:00
Matthew D Fleming
fc8fdae0df Fix up kernel sources to be ready for a 64-bit ino_t.
Original code by:	Gleb Kurtsou
2012-09-27 23:30:49 +00:00
Jaakko Heinonen
c5ab5ce345 tmpfs: Allow update mounts only for certain options.
Since r230208 update mounts were allowed if the list of mount options
contained the "export" option. This is not correct as tmpfs doesn't
really support updating all options.

Reviewed by:	kevlo, trociny
2012-04-16 18:07:42 +00:00
Gleb Kurtsou
9295c62814 tmpfs supports only INT_MAX nodes due to limitations of unit number
allocator.

Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in
memory is not likely to become possible in foreseeable feature and would
require new unit number allocator.

Discussed with: delphij
MFC after:	2 weeks
2012-04-07 15:30:46 +00:00
Gleb Kurtsou
0ff93c48da Add vfs_getopt_size. Support human readable file system options in tmpfs.
Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs.

Discussed with:	delphij
MFC after:	2 weeks
2012-04-07 15:27:34 +00:00
Gleb Kurtsou
da7aa2778e Add reserved memory limit sysctl to tmpfs.
Cleanup availble and used memory functions.
Check if free pages available before allocating new node.

Discussed with:	delphij
2012-04-07 15:23:51 +00:00
Kevin Lo
e0d3195bd6 Return EOPNOTSUPP since we only support update mounts for NFS export.
Spotted by:	trociny
2012-01-17 01:25:53 +00:00
Kevin Lo
57eb5548c9 Add nfs export support to tmpfs(5)
Reviewed by:	kib
2012-01-16 10:25:22 +00:00
Marcel Moolenaar
82543c5928 Don astbestos garment and remove the warning about TMPFS being experimental
-- highly experimental even. So far the closest to a bug in TMPFS that people
have gotten to relates to how ZFS can take away from the memory that TMPFS
needs. One can argue that such is not a bug in TMPFS. Irrespective, even if
there is a bug here and there in TMPFS, it's not in our own advantage to
scare people away from using TMPFS. I for one have been using it, even with
ZFS, very successfully.
2011-11-07 16:21:50 +00:00
Rick Macklem
694a586a43 Add a lock flags argument to the VFS_FHTOVP() file system
method, so that callers can indicate the minimum vnode
locking requirement. This will allow some file systems to choose
to return a LK_SHARED locked vnode when LK_SHARED is specified
for the flags argument. This patch only adds the flag. It
does not change any file system to use it and all callers
specify LK_EXCLUSIVE, so file system semantics are not changed.

Reviewed by:	kib
2011-05-22 01:07:54 +00:00
Jaakko Heinonen
dec3772ee4 Add "maxfilesize" mount option for tmpfs to allow specifying the
maximum file size limit. Default is UINT64_MAX when the option is
not specified. It was useless to set the limit to the total amount of
memory and swap in the system.

Use tmpfs_mem_info() rather than get_swpgtotal() in tmpfs_mount() to
check if there is enough memory available.

Remove now unused get_swpgtotal().

Reviewed by:	Gleb Kurtsou
Approved by:	trasz (mentor)
2010-01-29 12:09:14 +00:00
Jaakko Heinonen
189ee6be40 - Change the type of nodes_max to u_int and use "%u" format string to
convert its value. [1]
- Set default tm_nodes_max to min(pages + 3, UINT32_MAX). It's more
  reasonable than the old four nodes per page (with page size 4096) because
  non-empty regular files always use at least one page. This fixes possible
  overflow in the calculation. [2]
- Don't allow more than tm_nodes_max nodes allocated in tmpfs_alloc_node().

PR:		kern/138367
Suggested by:	bde [1], Gleb Kurtsou [2]
Approved by:	trasz (mentor)
2010-01-20 16:56:20 +00:00
Jaakko Heinonen
5364a38dba - Fix some style bugs in tmpfs_mount(). [1]
- Remove a stale comment about tmpfs_mem_info() 'total' argument.

Reported by:	bde [1]
2010-01-13 14:17:21 +00:00
Jaakko Heinonen
720c50b339 - Change the type of size_max to u_quad_t because its value is converted
with vfs_scanopt(9) using the "%qu" format string.
- Limit the maximum value of size_max to (SIZE_MAX - PAGE_SIZE) to
  prevent overflow in howmany() macro.

PR:		kern/141194
Approved by:	trasz (mentor)
MFC after:	2 weeks
2010-01-08 07:57:43 +00:00
Attilio Rao
dfd233edd5 Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS.  Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled.  Bump __FreeBSD_version in order to signal such
situation.
2009-05-11 15:33:26 +00:00
Xin LI
e08d55674d Reflect license change of NetBSD code.
Obtained from:	NetBSD
MFC after:	3 days
2008-09-03 18:53:48 +00:00
Attilio Rao
0359a12ead Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2008-08-28 15:23:18 +00:00
Konstantin Belousov
eab626f110 Move the head of byte-level advisory lock list from the
filesystem-specific vnode data to the struct vnode. Provide the
default implementation for the vop_advlock and vop_advlockasync.
Purge the locks on the vnode reclaim by using the lf_purgelocks().
The default implementation is augmented for the nfs and smbfs.
In the nfs_advlock, push the Giant inside the nfs_dolock.

Before the change, the vop_advlock and vop_advlockasync have taken the
unlocked vnode and dereferenced the fs-private inode data, racing with
with the vnode reclamation due to forced unmount. Now, the vop_getattr
under the shared vnode lock is used to obtain the inode size, and
later, in the lf_advlockasync, after locking the vnode interlock, the
VI_DOOMED flag is checked to prevent an operation on the doomed vnode.

The implementation of the lf_purgelocks() is submitted by dfr.

Reported by:	kris
Tested by:	kris, pho
Discussed with:	jeff, dfr
MFC after:	2 weeks
2008-04-16 11:33:32 +00:00
Attilio Rao
22db15c06f VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
Attilio Rao
cb05b60a89 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
Xin LI
745973bd99 size_max should be unsigned, as such, use size_t here. 2007-12-06 23:19:05 +00:00
Xin LI
7871e52bfd MFp4: Several fixes to tmpfs which makes it to survive from pho@'s
strees2 suite, to quote his letter, this change:

1. It removes the tn_lookup_dirent stuff. I think this cannot be fixed,
   because nothing protects vnode/tmpfs node between lookup is done, and
   actual operation is performed, in the case the vnode lock is dropped.
   At least, this is the case with the from vnode for rename.

   For now, we do the linear lookup in the parent node. This has its own
   drawbacks. Not mentioning speed (that could be fixed by using hash), the
   real problem is the situation where several hardlinks exist in the dvp.
   But, I think this is fixable.

2. The patch restores the VV_ROOT flag on the root vnode after it became
   reclaimed and allocated again. This fixes MPASS assertion at the start
   of the tmpfs_lookup() reported by many.

Submitted by:	kib
2007-11-18 04:52:40 +00:00
Xin LI
e0f51ae7cd MFp4: Fix several style(9) bugs.
Submitted by:	des
2007-11-18 04:40:42 +00:00
Xin LI
eed4ee29e5 Correct a stack overflow which will trigger panics when
mode= is specified, caused by incorrect format string
specified to vfs_scanopt() and subsequently vsscanf().

Pointed out by:	kib
Submitted by:	des
2007-11-12 18:57:33 +00:00
Xin LI
3543c1b429 MFp4: Provide a dummy verb "export" to shut up the message
showed up at start when NFS is enabled.

Reported by:	rafan
Approved by:	re (tmpfs blanket)
2007-10-04 17:11:48 +00:00
Xin LI
386c969205 Additional work is still needed before we can claim that tmpfs
is stable enough for production usage.  Warn user upon mount.

Approved by:	re (tmpfs blanket)
2007-10-04 17:08:46 +00:00
Xin LI
0ae6383d39 MFp4:
- Respect cnflag and don't lock vnode always as LK_EXCLUSIVE [1]
 - Properly lock around tn_vnode to avoid NULL deference
 - Be more careful handling vnodes (*)

(*) This is a WIP
[1] by pjd via howardsu

Thanks kib@ for his valuable VFS related comments.

Tested with:	fsx, fstest, tmpfs regression test set
Found by:	pho's stress2 suite
Approved by:	re (tmpfs blanket)
2007-08-10 05:24:49 +00:00
Xin LI
f62e5595fd MFp4: Force 64-bit arithmatic when caculating the maximum file size.
This fixes tmpfs caculations on 32-bit systems equipped with more than
4GB swap.

Reported by:	Craig Boston <craig xfoil gank org>
PR:		kern/114870
Approved by:	re (tmpfs blanket)
2007-07-24 17:14:53 +00:00
Xin LI
7280082944 MFp4: When swapping is not enabled, allow creating files by taking
physical memory pages into account for tm_maxfilesize.

Reported by:	Dominique Goncalves <dominique.goncalves gmail.com>
Submitted by:	Howard Su
Approved by:	re (tmpfs blanket)
2007-07-23 06:54:58 +00:00
Xin LI
8d9a89a3a0 MFp4: Make use of the kernel unit number allocation facility
for tmpfs nodes.

Submitted by:	Mingyan Guo <guomingyan gmail com>
Approved by:	re (tmpfs blanket)
2007-07-11 14:26:27 +00:00
Xin LI
1df86a323d MFp4:
- Plug memory leak.
 - Respect underlying vnode's properties rather than assuming that
   the user want root:wheel + 0755.  Useful for using tmpfs(5) for
   /tmp.
 - Use roundup2 and howmany macros instead of rolling our own version.
 - Try to fix fsx -W -R foo case.
 - Instead of blindly zeroing a page, determine whether we need a pagein
   order to prevent data corruption.
 - Fix several bugs reported by Coverity.

Submitted by:	Mingyan Guo <guomingyan gmail com>, Howard Su, delphij
Coverity ID:	CID 2550, 2551, 2552, 2557
Approved by:	re (tmpfs blanket)
2007-07-08 15:56:12 +00:00
Xin LI
9b258fca27 MFp4:
- Remove unnecessary NULL checks after M_WAITOK allocations.
 - Use VOP_ACCESS instead of hand-rolled suser_cred()
   calls. [1]
 - Use malloc(9) KPI to allocate memory for string.  The
   optimization taken from NetBSD is not valid for FreeBSD
   because our malloc(9) already act that way. [2]

Requested by:	rwatson [1]
Submitted by:	Howard Su [2]
Approved by:	re (tmpfs blanket)
2007-06-29 05:23:15 +00:00