Commit Graph

295 Commits

Author SHA1 Message Date
Mateusz Guzik
f4aa64528e tmpfs: save on relocking the allnode lock in tmpfs_free_node_locked 2021-05-31 23:21:15 +00:00
Mateusz Guzik
331a7601c9 tmpfs: save on common case relocking in tmpfs_reclaim 2021-05-29 22:04:10 +00:00
Mateusz Guzik
439d942b9e tmpfs: drop a redundant NULL check in tmpfs_alloc_vp 2021-05-29 22:04:10 +00:00
Mateusz Guzik
7fbeaf33b8 tmpfs: drop useless parent locking from tmpfs_dir_getdotdotdent
The id field is immutable until the node gets freed.
2021-05-29 22:04:10 +00:00
Mateusz Guzik
eec2e4ef7f tmpfs: reimplement the mtime scan to use the lazy list
Tested by:	pho
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D30065
2021-05-15 20:48:45 +00:00
Mateusz Guzik
128e25842e vm: add another pager private flag
Move OBJ_SHADOWLIST around to let pager flags be next to each other.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D30258
2021-05-15 20:47:29 +00:00
Konstantin Belousov
28bc23ab92 tmpfs: dynamically register tmpfs pager
Remove OBJT_SWAP_TMPFS. Move tmpfs-specific swap pager bits into
tmpfs_subr.c.

There is no longer any code to directly support tmpfs in sys/vm, most
tmpfs knowledge is shared by non-anon swap object type implementation.
The tmpfs-specific methods are provided by registered tmpfs pager, which
inherits from the swap pager.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30168
2021-05-13 20:13:34 +03:00
Konstantin Belousov
4b8365d752 Add OBJT_SWAP_TMPFS pager
This is OBJT_SWAP pager, specialized for tmpfs.  Right now, both swap pager
and generic vm code have to explicitly handle swap objects which are tmpfs
vnode v_object, in the special ways.  Replace (almost) all such places with
proper methods.

Since VM still needs a notion of the 'swap object', regardless of its
use, add yet another type-classification flag OBJ_SWAP. Set it in
vm_object_allocate() where other type-class flags are set.

This change almost completely eliminates the knowledge of tmpfs from VM,
and opens a way to make OBJT_SWAP_TMPFS loadable from tmpfs.ko.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Alex Richardson
1d15bceae6 tmpfs: implement pathconf(_PC_SYMLINK_MAX)
This fixes one of the sys/audit tests when running them on tmpfs.

Reviewed By:	delphij, kib
Differential Revision: https://reviews.freebsd.org/D28387
2021-01-29 09:30:25 +00:00
Kyle Evans
0f919ed4ae tmpfs: push VEXEC check into tmpfs_lookup()
vfs_cache_lookup() has already done the appropriate VEXEC check, therefore
we must not re-check in VOP_CACHEDLOOKUP.

This fixes O_SEARCH semantics on tmpfs and removes a redundant descent into
VOP_ACCESS() in the common case.

Reported-by:	arichardson (via CheriBSD Jenkins CI)
Reviewed-by:	kib
MFC-after:	3 days
Differential Revision:	https://reviews.freebsd.org/D28401
2021-01-28 19:25:11 -06:00
Mateusz Guzik
c09f799271 tmpfs: drop acq fence now that vn_load_v_data_smr has consume semantics 2021-01-25 22:40:15 +00:00
Mateusz Guzik
cc96f92a57 atomic: make atomic_store_ptr type-aware 2021-01-25 22:40:15 +00:00
Mateusz Guzik
618029af50 tmpfs: add support for lockless symlink lookup
Reviewed by:	kib (previous version)
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D27488
2021-01-23 15:04:43 +00:00
Konstantin Belousov
2d1e4220eb tmpfs_reclaim: detach unlinked node on dereferencing.
Otherwise it is dereferenced one extra time at unmount, if it survives
long enough.  One way to hold the reference on such node is to keep it
open.

tmpfs_vptocnp() now needs to account for the possibility that unlocked
node was removed from the list.

Reported by:	danfe
Tested by:	danfe, pho
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-01-14 14:51:37 +02:00
Konstantin Belousov
685265ecfb tmpfs_reclaim: style
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2021-01-14 14:43:13 +02:00
Konstantin Belousov
ac2576b9f7 tmpfs open: assert that there is no double-init of f_data.
Sponsored by:	The FreeBSD Foundation
2021-01-10 04:48:36 +02:00
Konstantin Belousov
9f200bc47b tmpfs_free_tmp(): explicitly assert that tmp is locked
Despite TMPFS_UNLOCK() is done in both paths later, unlocking not locked
mutex provides different failure mode.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-01-10 04:48:29 +02:00
Konstantin Belousov
42bebbda9e tmpfs: make M_TMPFSMNT static to tmpfs_vfsops.c
This malloc type is only used in this file.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-01-10 04:44:55 +02:00
Mark Johnston
90f580b954 Ensure that dirent's d_off field is initialized
We have the d_off field in struct dirent for providing the seek offset
of the next directory entry.  Several filesystems were not initializing
the field, which ends up being copied out to userland.

Reported by:	Syed Faraz Abrar <faraz@elttam.com>
Reviewed by:	kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27792
2021-01-03 11:50:31 -05:00
Mateusz Guzik
3e506a67bb vfs: add v_irflag accessors
Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D27793
2021-01-03 06:50:06 +00:00
Mateusz Guzik
d71965127f tmpfs: use VNPASS when asserting on a vnode in tmpfs_read_pgcache 2021-01-01 03:23:01 +00:00
Mateusz Guzik
f24aa01f9d tmpfs: reorder struct tmpfs_node to shrink it by 8 bytes
The reduction (232 -> 224 bytes) allows UMA to fit one more item (17 -> 18)
per slab as reported in vm.uma.TMPFS_node.keg.ipers.
2020-11-05 11:24:45 +00:00
Mateusz Guzik
7c58c37ebb tmpfs: change tmpfs dirent zone into a malloc type
It is 64 bytes.
2020-10-30 14:07:25 +00:00
Mateusz Guzik
4bfebc8d2c cache: add cache_vop_mkdir and rename cache_rename to cache_vop_rename 2020-10-30 10:46:35 +00:00
Mateusz Guzik
25fb30bd9a vfs: drop spurious cache_purge on rmdir
The removed directory gets cache_purged which is sufficient to remove any entries
related to the parent.

Note only tmpfs, ufs and zfs are patched.
2020-10-23 15:50:49 +00:00
Eric van Gyzen
f9cc8410e1 vm_ooffset_t is now unsigned
vm_ooffset_t is now unsigned. Remove some tests for negative values,
or make other adjustments accordingly.

Reported by:	Coverity
Reviewed by:	kib markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D26214
2020-09-18 16:48:08 +00:00
Konstantin Belousov
016b7c7e39 tmpfs: restore atime updates for reads from page cache.
Split TMPFS_NODE_ACCCESSED bit into dedicated byte that can be updated
atomically without locks or (locked) atomics.

tn_update_getattr() change also contains unrelated bug fix.

Reported by:	lwhsu
PR:	249362
Reviewed by:	markj (previous version)
Discussed with:	mjg
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26451
2020-09-16 21:28:18 +00:00
Konstantin Belousov
23f9071466 Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2020-09-16 21:24:34 +00:00
Konstantin Belousov
081e36e760 Add tmpfs page cache read support.
Or it could be explained as lockless (for vnode lock) reads.  Reads
are performed from the node tn_obj object.  Tmpfs regular vnode object
lifecycle is significantly different from the normal OBJT_VNODE: it is
alive as far as ref_count > 0.

Ensure liveness of the tmpfs VREG node and consequently v_object
inside VOP_READ_PGCACHE by referencing tmpfs node in tmpfs_open().
Provide custom tmpfs fo_close() method on file, to ensure that close
is paired with open.

Add tmpfs VOP_READ_PGCACHE that takes advantage of all tmpfs quirks.
It is quite cheap in code size sense to support page-ins for read for
tmpfs even if we do not own tmpfs vnode lock.  Also, we can handle
holes in tmpfs node without additional efforts, and do not have
limitation of the transfer size.

Reviewed by:	markj
Discussed with and benchmarked by:	mjg (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:19:16 +00:00
Konstantin Belousov
4601f5f5ee Microoptimize tmpfs node ref/unref by using atomics.
Avoid tmpfs mount and node locks when ref count is greater than zero,
which is the case until node is being destroyed by unlink or unmount.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26346
2020-09-15 22:13:21 +00:00
Mateusz Guzik
c86d2ba8a5 tmpfs: drop spurious cache_purge in tmpfs_reclaim
vgone already performs it.
2020-09-04 19:30:15 +00:00
Mateusz Guzik
586ee69f09 fs: clean up empty lines in .c and .h files 2020-09-01 21:18:40 +00:00
Mateusz Guzik
39f8815070 cache: add cache_rename, a dedicated helper to use for renames
While here make both tmpfs and ufs use it.

No fuctional changes.
2020-08-20 10:05:46 +00:00
Mateusz Guzik
1abe36567f tmpfs: use vget_prep/vget_finish instead of vget + vnode 2020-08-16 17:19:23 +00:00
Mateusz Guzik
a92a971bbb vfs: remove the thread argument from vget
It was already asserted to be curthread.

Semantic patch:

@@

expression arg1, arg2, arg3;

@@

- vget(arg1, arg2, arg3)
+ vget(arg1, arg2)
2020-08-16 17:18:54 +00:00
Mateusz Guzik
03337743db vfs: clean MNTK_FPLOOKUP if MNT_UNION is set
Elides checking it during lookup.
2020-08-10 11:51:21 +00:00
Mateusz Guzik
9a14439f2f tmpfs: add VOP_STAT handler 2020-08-07 23:07:47 +00:00
Mateusz Guzik
d292b1940c vfs: remove the obsolete privused argument from vaccess
This brings argument count down to 6, which is passable without the
stack on amd64.
2020-08-05 09:27:03 +00:00
Mateusz Guzik
172ffe702c tmpfs: add support for lockless lookup
Reviewed by:    kib
Tested by:      pho (in a patchset)
Differential Revision:	https://reviews.freebsd.org/D25580
2020-07-25 10:38:44 +00:00
Mark Johnston
84242cf68a Call swap_pager_freespace() from vm_object_page_remove().
All vm_object_page_remove() callers, except
linux_invalidate_mapping_pages() in the LinuxKPI, free swap space when
removing a range of pages from an object.  The LinuxKPI case appears to
be an unintentional omission that could result in leaked swap blocks, so
unconditionally free swap space in vm_object_page_remove() to protect
against similar bugs in the future.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25329
2020-06-25 15:21:21 +00:00
Ryan Moeller
693d10a291 tmpfs: Preserve alignment of struct fid fields
On 64-bit platforms, the two short fields in `struct tmpfs_fid` are padded to
the 64-bit alignment of the long field.  This pushes the offsets of the
subsequent fields by 4 bytes and makes `struct tmpfs_fid` bigger than
`struct fid`.  `tmpfs_vptofh()` casts a `struct fid *` to `struct tmpfs_fid *`,
causing 4 bytes of adjacent memory to be overwritten when the struct fields are
set.  Through several layers of indirection and embedded structs, the adjacent
memory for one particular call to `tmpfs_vptofh()` happens to be the stack
canary for `nfsrvd_compound()`.  Half of the canary ends up being clobbered,
going unnoticed until eventually the stack check fails when `nfsrvd_compound()`
returns and a panic is triggered.

Instead of duplicating fields of `struct fid` in `struct tmpfs_fid`, narrow the
struct to cover only the unique fields for tmpfs and assert at compile time
that the struct fits in the allotted space.  This way we don't have to
replicate the offsets of `struct fid` fields, we just use them directly.

Reviewed by:	kib, mav, rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25077
2020-06-03 09:38:51 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Mateusz Guzik
074ad60a4c vfs: make write suspension mandatory
At the time opt-in was introduced adding yourself as a writer was esrializing
across the mount point. Nowadays it is fully per-cpu, the only impact being
a small single-threaded hit on top of what's there right now.

Vast majority of the overhead stems from the call to VOP_GETWRITEMOUNT which
has is done regardless.

Should someone want to microoptimize this single-threaded they can coalesce
looking the mount up with adding a write to it.
2020-02-15 13:00:39 +00:00
Konstantin Belousov
c1e84733ac tmpfs: add nomtime mount option,
which disables tracking mtime updates due to writes through the shared
mapped areas backed by tmpfs files.  This removes periodic scans which
downgrades rw mapped pages to ro to note the writes.

Suggested by:	mjg
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D23432
2020-02-04 19:05:58 +00:00
Konstantin Belousov
b66352b787 tmpfs_mount update: simplify, cache the value of VFS_TO_TMPFS() calculation.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-02-04 18:52:25 +00:00
Mateusz Guzik
2abdae33b1 tmpfs: inline tmpfs_update
It was generated to be just a jumping off point to tmpfs_itimes.

While here provide a dedicated variant for getattr since we normally don't
expect to need to the update from that caller.
2020-02-03 17:06:21 +00:00
Kyle Evans
6a5abb1ee5 Provide O_SEARCH
O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping
permissions checks on the directory itself after the initial open(). This is
close to the semantics we've historically applied for O_EXEC on a directory,
which is UB according to POSIX. Conveniently, O_SEARCH on a file is also
explicitly undefined behavior according to POSIX, so O_EXEC would be a fine
choice. The spec goes on to state that O_SEARCH and O_EXEC need not be
distinct values, but they're not defined to be the same value.

This was pointed out as an incompatibility with other systems that had made
its way into libarchive, which had assumed that O_EXEC was an alias for
O_SEARCH.

This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC
respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a
directory is checked in vn_open_vnode already, so for completeness we add a
NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not
re-check that when descending in namei.

[0] https://pubs.opengroup.org/onlinepubs/9699919799/

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23247
2020-02-02 16:34:57 +00:00
Mateusz Guzik
45757984f8 vfs: consistently use size_t for buflen around VOP_VPTOCNP 2020-02-01 20:34:43 +00:00
Jeff Roberson
d6e13f3b4d Don't hold the object lock while calling getpages.
The vnode pager does not want the object lock held.  Moving this out allows
further object lock scope reduction in callers.  While here add some missing
paging in progress calls and an assert.  The object handle is now protected
explicitly with pip.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D23033
2020-01-19 23:47:32 +00:00
Mateusz Guzik
2a829749d3 tmpfs: add missing CLTFLAG_MPSAFE annotation 2020-01-15 01:32:11 +00:00