Commit Graph

130 Commits

Author SHA1 Message Date
Alan Somers
bfcb817bcd Fix issues with FUSE_ACCESS when default_permissions is disabled
This patch fixes two issues relating to FUSE_ACCESS when the
default_permissions mount option is disabled:

* VOP_ACCESS() calls with VADMIN set should never be sent to a fuse server
  in the form of FUSE_ACCESS operations. The FUSE protocol has no equivalent
  of VADMIN, so we must evaluate such things kernel-side, regardless of the
  default_permissions setting.

* The FUSE protocol only requires FUSE_ACCESS to be sent for two purposes:
  for the access(2) syscall and to check directory permissions for
  searchability during lookup. FreeBSD sends it much more frequently, due to
  differences between our VFS and Linux's, for which FUSE was designed. But
  this patch does eliminate several cases not required by the FUSE protocol:

  * for any FUSE_*XATTR operation
  * when creating a new file
  * when deleting a file
  * when setting timestamps, such as by utimensat(2).

* Additionally, when default_permissions is disabled, this patch removes one
  FUSE_GETATTR operation when deleting a file.

PR:		245689
Reported by:	MooseFS FreeBSD Team <freebsd@moosefs.pro>
Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D24777
2020-05-22 18:11:17 +00:00
Alan Somers
b0ecfb42d1 fusefs: avoid cache corruption with buggy fuse servers
The FUSE protocol allows the client (kernel) to cache a file's size, if the
server (userspace daemon) allows it. A well-behaved daemon obviously should
not change a file's size while a client has it cached. But a buggy daemon
might. If the kernel ever detects that that has happened, then it should
invalidate the entire cache for that file. Previously, we would not only
cache stale data, but in the case of a file extension while we had the size
cached, we accidentally extended the cache with zeros.

PR:		244178
Reported by:	Ben RUBSON <ben.rubson@gmx.com>
Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D24012
2020-03-11 04:29:45 +00:00
Kyle Evans
6a5abb1ee5 Provide O_SEARCH
O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping
permissions checks on the directory itself after the initial open(). This is
close to the semantics we've historically applied for O_EXEC on a directory,
which is UB according to POSIX. Conveniently, O_SEARCH on a file is also
explicitly undefined behavior according to POSIX, so O_EXEC would be a fine
choice. The spec goes on to state that O_SEARCH and O_EXEC need not be
distinct values, but they're not defined to be the same value.

This was pointed out as an incompatibility with other systems that had made
its way into libarchive, which had assumed that O_EXEC was an alias for
O_SEARCH.

This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC
respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a
directory is checked in vn_open_vnode already, so for completeness we add a
NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not
re-check that when descending in namei.

[0] https://pubs.opengroup.org/onlinepubs/9699919799/

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23247
2020-02-02 16:34:57 +00:00
Mateusz Guzik
6fa079fc3f vfs: flatten vop vectors
This eliminates the following loop from all VOP calls:

while(vop != NULL && \
    vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
        vop = vop->vop_default;

Reviewed by:	jeff
Tesetd by:	pho
Differential Revision:	https://reviews.freebsd.org/D22738
2019-12-16 00:06:22 +00:00
Alan Somers
42767f76af fusefs: fix some minor issues with fuse_vnode_setparent
* When unparenting a vnode, actually clear the flag. AFAIK this is basically
  a no-op because we only unparent a vnode when reclaiming it or when
  unlinking.

* There's no need to call fuse_vnode_setparent during reclaim, because we're
  about to free the vnode data anyway.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21630
2019-09-16 14:51:49 +00:00
Alan Somers
16f8783452 Coverity fixes in fusefs(5)
CID 1404532 fixes a signed vs unsigned comparison error in fuse_vnop_bmap.
It could potentially have resulted in VOP_BMAP reporting too many
consecutive blocks.

CID 1404364 is much worse. It was an array access by an untrusted,
user-provided variable. It could potentially have resulted in a malicious
file system crashing the kernel or worse.

Reported by:	Coverity
Reviewed by:	emaste
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21466
2019-09-06 19:40:11 +00:00
Mark Johnston
9222b82368 Remove unused VM page locking macros.
They were orphaned by r292373.

Reviewed by:	asomers
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21469
2019-08-29 22:13:15 +00:00
Konstantin Belousov
6470c8d3db Rework v_object lifecycle for vnodes.
Current implementation of vnode_create_vobject() and
vnode_destroy_vobject() is written so that it prepared to handle the
vm object destruction for live vnode.  Practically, no filesystems use
this, except for some remnants that were present in UFS till today.
One of the consequences of that model is that each filesystem must
call vnode_destroy_vobject() in VOP_RECLAIM() or earlier, as result
all of them get rid of the v_object in reclaim.

Move the call to vnode_destroy_vobject() to vgonel() before
VOP_RECLAIM().  This makes v_object stable: either the object is NULL,
or it is valid vm object till the vnode reclamation.  Remove code from
vnode_create_vobject() to handle races with the parallel destruction.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D21412
2019-08-29 07:50:25 +00:00
Alan Somers
5e63333052 fusefs: Fix some bugs regarding the size of the LISTXATTR list
* A small error in r338152 let to the returned size always being exactly
  eight bytes too large.

* The FUSE_LISTXATTR operation works like Linux's listxattr(2): if the
  caller does not provide enough space, then the server should return ERANGE
  rather than return a truncated list.  That's true even though in FUSE's
  case the kernel doesn't provide space to the client at all; it simply
  requests a maximum size for the list.  We previously weren't handling the
  case where the server returns ERANGE even though the kernel requested as
  much size as the server had told us it needs; that can happen due to a
  race.

* We also need to ensure that a pathological server that always returns
  ERANGE no matter what size we request in FUSE_LISTXATTR won't cause an
  infinite loop in the kernel.  As of this commit, it will instead cause an
  infinite loop that exits and enters the kernel on each iteration, allowing
  signals to be processed.

Reviewed by:	cem
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21287
2019-08-28 04:19:37 +00:00
Alan Somers
3a79e8e772 fusefs: don't send the namespace during listextattr
The FUSE_LISTXATTR operation always returns the full list of a file's
extended attributes, in all namespaces. There's no way to filter the list
server-side. However, currently FreeBSD's fusefs driver sends a namespace
string with the FUSE_LISTXATTR request. That behavior was probably copied
from fuse_vnop_getextattr, which has an attribute name argument. It's
been there ever since extended attribute support was added in r324620. This
commit removes it.

Reviewed by:	cem
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21280
2019-08-16 05:06:54 +00:00
Alan Somers
58df81b339 MFHead @350426
Sponsored by:	The FreeBSD Foundation
2019-07-30 04:17:36 +00:00
Mark Johnston
918988576c Avoid relying on header pollution from sys/refcount.h.
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2019-07-29 20:26:01 +00:00
Alan Somers
8aafc8c389 [skip ci] update copyright headers in fusefs files
Sponsored by:	The FreeBSD Foundation
2019-06-28 04:18:10 +00:00
Alan Somers
435ecf40bb fusefs: recycle vnodes after their last unlink
Previously fusefs would never recycle vnodes.  After VOP_INACTIVE, they'd
linger around until unmount or the vnlru reclaimed them.  This commit
essentially actives and inlines the old reclaim_revoked sysctl, and fixes
some issues dealing with the attribute cache and multiply linked files.

Sponsored by:	The FreeBSD Foundation
2019-06-27 20:18:12 +00:00
Alan Somers
560a55d094 fusefs: convert statistical sysctls to use counter(9)
counter(9) is more performant than using atomic instructions to update
sysctls that just report statistics to userland.

Sponsored by:	The FreeBSD Foundation
2019-06-27 16:30:25 +00:00
Alan Somers
caeea8b4cc fusefs: fix some memory leaks
Fix memory leaks relating to FUSE_BMAP and FUSE_CREATE.  There are still
leaks relating to FUSE_INTERRUPT, but they'll be harder to fix since the
server is legally allowed to never respond to a FUSE_INTERRUPT operation.

Sponsored by:	The FreeBSD Foundation
2019-06-27 00:00:48 +00:00
Alan Somers
b9e2019755 fusefs: rewrite vop_getpages and vop_putpages
Use the standard facilities for getpages and putpages instead of bespoke
implementations that don't work well with the writeback cache.  This has
several corollaries:

* Change the way we handle short reads _again_.  vfs_bio_getpages doesn't
  provide any way to handle unexpected short reads.  Plus, I found some more
  lock-order problems.  So now when the short read is detected we'll just
  clear the vnode's attribute cache, forcing the file size to be requeried
  the next time it's needed.  VOP_GETPAGES doesn't have any way to indicate
  a short read to the "caller", so we just bzero the rest of the page
  whenever a short read happens.

* Change the way we decide when to set the FUSE_WRITE_CACHE bit.  We now set
  it for clustered writes even when the writeback cache is not in use.

Sponsored by:   The FreeBSD Foundation
2019-06-25 17:24:43 +00:00
Alan Somers
a1c9f4ad0d fusefs: implement VOP_BMAP
If the fuse daemon supports FUSE_BMAP, then use that for the block mapping.
Otherwise, use the same technique used by vop_stdbmap.  Report large values
for runp and runb in order to maximize read clustering and minimize upcalls,
even if we don't know the true layout.

The major result of this change is that sequential reads to FUSE files will
now usually happen 128KB at a time instead of 64KB.

Sponsored by:	The FreeBSD Foundation
2019-06-20 17:08:21 +00:00
Alan Somers
0269ae4c19 MFHead @348740
Sponsored by:	The FreeBSD Foundation
2019-06-06 16:20:50 +00:00
Alan Somers
0d2bf48996 fusefs: check the vnode cache when looking up files for the NFS server
FUSE allows entries to be cached for a limited amount of time.  fusefs's
vnop_lookup method already implements that using the timeout functionality
of cache_lookup/cache_enter_time.  However, lookups for the NFS server go
through a separate path: vfs_vget.  That path can't use the same timeout
functionality because cache_lookup/cache_enter_time only work on pathnames,
whereas vfs_vget works by inode number.

This commit adds entry timeout information to the fuse vnode structure, and
checks it during vfs_vget.  This allows the NFS server to take advantage of
cached entries.  It's also the same path that FUSE's asynchronous cache
invalidation operations will use.

Sponsored by:	The FreeBSD Foundation
2019-05-31 21:22:58 +00:00
Alan Somers
a4856c96d0 fusefs: raise protocol level to 7.12
This commit raises the protocol level and adds backwards-compatibility code
to handle structure size changes.  It doesn't implement any new features.
The new features added in protocol 7.12 are:

* server-side umask processing (which FreeBSD won't do)
* asynchronous inode and directory entry invalidation (which I'll do next)

Sponsored by:	The FreeBSD Foundation
2019-05-29 16:39:52 +00:00
Alan Somers
e039bafa87 fusefs: add comments explaining why 7.11 features aren't implemented
Protocol 7.11 adds two new features, but neither of them were defined
correctly.  FUSE_IOCTL messages don't work for 32-bit daemons on a 64-bit
host (fixed in protocol 7.16).  FUSE_POLL is basically unusable until 7.21.
Before 7.21, the client can't choose which events to register for; the
client registers for "something" and the server replies to say which events
the client is registered for.  Also, before 7.21 there was no way for a
client to deregister a file handle.

Sponsored by:	The FreeBSD Foundation
2019-05-29 02:03:08 +00:00
Alan Somers
8aa24ed381 fusefs: flock(2) locks must be implemented in-kernel
If a FUSE file system sets the FUSE_POSIX_LOCKS flag then it can support
fcntl(2)-style locks directly.  However, the protocol does not adequately
support flock(2)-style locks until revision 7.17.  They must be implemented
locally in-kernel instead.  This unfortunately breaks the interoperability
of fcntl(2) and flock(2) locks for file systems that support the former.
C'est la vie.

Prior to this commit flock(2) would get sent to the server as a
fcntl(2)-style lock with the lock owner field set to stack garbage.

Sponsored by:	The FreeBSD Foundation
2019-05-28 00:03:46 +00:00
Alan Somers
65417f5e27 Remove "struct ucred*" argument from vtruncbuf
vtruncbuf takes a "struct ucred*" argument. AFAICT, it's been unused ever
since that function was first added in r34611. Remove it.  Also, remove some
"struct ucred" arguments from fuse and nfs functions that were only used by
vtruncbuf.

Reviewed by:	cem
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20377
2019-05-24 20:27:50 +00:00
Alan Somers
e76986fde0 fusefs: fix exporting fuse filesystems with nfsd
A previous commit made fuse exportable via userland NFS servers.
Compatibility with the in-kernel nfsd required two more changes:

* During read and write operations, implicitly do a FUSE_OPEN if there isn't
  already a valid file handle.  That's because nfsd never calls VOP_OPEN.
* During VOP_READDIR, if an implicit open was necessary, directory offsets
  from a previous VOP_READDIR may not be valid, so VOP_READDIR may have to
  start from the beginning and read until it encounters the requested
  offset.

I've done only limited testing over NFS, so there are probably still some
more bugs.  Thanks to rmacklem for all of the readdir changes, which he had
made for his pnfs work.

Sponsored by:	The FreeBSD Foundation
2019-05-23 23:06:26 +00:00
Alan Somers
e5b50fe736 fusefs: Make fuse file systems NFS-exportable
This commit adds the VOPs needed by userspace NFS servers (tested with
net/unfs3).  More work is needed to make the in-kernel nfsd work, because of
its stateless nature.  It doesn't open files prior to doing I/O.  Also, the
NFS-related VOPs currently ignore the entry cache.

Sponsored by:	The FreeBSD Foundation
2019-05-23 00:44:01 +00:00
Alan Somers
2013b723d3 fusefs: improve attribute cacheing
Consolidate all calls to fuse_vnode_setsize as a result of a file attribute
change to one location in fuse_internal_setattr.  There are still a few
calls elsewhere that happen as a result of a write.

Sponsored by:	The FreeBSD Foundation
2019-05-23 00:22:03 +00:00
Alan Somers
d311d6c467 fusefs: eliminate a superfluous fuse_node_setparent
Sponsored by:	The FreeBSD Foundation
2019-05-20 20:55:01 +00:00
Alan Somers
96192dfce0 fusefs: diff reduction vs the upstream sources
fuse_kernel.h defines the structures used by the FUSE protocol.  Originally
it came from libfuse, but the current source of truth is the Linux kernel.
This commit minimizes the diffs between our version and the Linux version as
of 21f3da95d (protocol version 7.8).

The flags field of struct fuse_listxattr_out and fuse_listxattr_in was an
error in our header.  Those fields don't exist in Linux or libfuse, and
they've never been used in FreeBSD.  In fact, those structs don't even exist
in Linux and libfuse; those projects confusingly overload the identical
fuse_getexattr_in and fuse_getxattr_out structs.

Sponsored by:	The FreeBSD Foundation
2019-05-15 22:51:25 +00:00
Alan Somers
3d15b234a4 fusefs: don't track a file's size in two places
fuse_vnode_data.filesize was mostly redundant with
fuse_vnode_data.cached_attrs.st_size, but didn't have exactly the same
meaning.  It was very confusing.  This commit eliminates the former.  It
also eliminates fuse_vnode_refreshsize, which ignored the cache timeout
value.

Sponsored by:	The FreeBSD Foundation
2019-05-15 00:38:52 +00:00
Alan Somers
5940f822ae fusefs: remove the vfs.fusefs.data_cache_invalidate sysctl
This sysctl was added > 6.5 years ago and I don't know why.  The description
seems at odds with the code.  While it's supposed to "discard clean cached
data" during VOP_INACTIVE, it looks like it would discard any cached data,
clean or otherwise.

Sponsored by:	The FreeBSD Foundation
2019-05-13 20:57:21 +00:00
Alan Somers
d5ff268834 fusefs: create sockets with FUSE_MKNOD, not FUSE_CREATE
libfuse expects sockets to be created with FUSE_MKNOD, not FUSE_CREATE,
because that's how Linux does it.  My first attempt at creating sockets
(r346894) used FUSE_CREATE because FreeBSD uses VOP_CREATE for this purpose.
There are no backwards-compatibility concerns with this change, because
socket support hasn't yet been merged to head.

Sponsored by:	The FreeBSD Foundation
2019-05-09 16:25:01 +00:00
Alan Somers
002e54b0aa fusefs: clear a dir's attr cache when its contents change
Any change to a directory's contents should cause its mtime and ctime to be
updated by the FUSE daemon.  Clear its attribute cache so we'll get the new
attributs the next time that they're needed.  This affects the following
VOPs: VOP_CREATE, VOP_LINK, VOP_MKDIR, VOP_MKNOD, VOP_REMOVE, VOP_RMDIR, and
VOP_SYMLINK

Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-09 01:16:34 +00:00
Alan Somers
8e45ec4e64 fusefs: fix a permission handling bug during VOP_RENAME
If the file to be renamed is a directory and it's going to get a new parent,
then the user must have write permissions to that directory, because the
".." dirent must be changed.

Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-08 22:28:13 +00:00
Alan Somers
d943c93e76 fusefs: allow non-owners to set timestamps to UTIME_NOW
utimensat should allow anybody with write access to set atime and mtime to
UTIME_NOW.

PR:		237181
Sponsored by:	The FreeBSD Foundation
2019-05-08 19:42:00 +00:00
Alan Somers
4ae3a56cb1 fusefs: updated cached attributes during VOP_LINK.
FUSE_LINK returns a new set of attributes.  fusefs should cache them just
like it does during other VOPs.  This is not only a matter of performance
but of correctness too; without caching the new attributes the vnode's nlink
value would be out-of-date.

Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-08 18:12:38 +00:00
Alan Somers
a2bdd7379b fusefs: drop suid after a successful chown by a non-root user
Drop sgid too.  Also, drop them after a successful chgrp.

Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-07 22:38:13 +00:00
Alan Somers
4e83d6555e fusefs: allow the null chown and null chgrp
Even an unprivileged user should be able to chown a file to its current
owner, or chgrp it to its current group.  Those are no-ops.

Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-07 01:27:23 +00:00
Alan Somers
1c8a5f5e39 fusefs: disable posix_fallocate
fuse file systems have far too much variability for the standard
posix_fallocate implementation to work.  A future protocol revision (7.19)
adds a FUSE_FALLOCATE operation, but we don't support that yet.  Better to
simply return EINVAL until then.

Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-07 00:03:05 +00:00
Alan Somers
3fa127896b fusefs: allow ftruncate on files without write permission
ftruncate should succeed as long as the file descriptor is writable, even if
the file doesn't have write permission.  This is important when combined
with O_CREAT.

Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-06 20:46:58 +00:00
Alan Somers
8cfb44315a fusefs: Fix another obscure permission handling bug
Don't allow unprivileged users to set SGID on files to whose group they
don't belong.  This is slightly different than what POSIX says we should do
(clear sgid on return from a successful chmod), but it matches what UFS
currently does.

Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-06 16:54:35 +00:00
Alan Somers
a90e32de25 fusefs: clear SUID & SGID after a successful write by a non-owner
Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-06 16:17:55 +00:00
Alan Somers
ac0a68e9cd fusefs: don't allow truncating irregular files on an read-only mount
The readonly mount check had a special case allowing the sizes of files to
be changed if they weren't regular files.  I don't know why.  Neither UFS,
ZFS, nor ext2 have such a special case, and I don't know when you would ever
change the size of a non-regular file anyway.

Sponsored by:	The FreeBSD Foundation
2019-05-06 15:20:18 +00:00
Alan Somers
e5ff3a7e28 fusefs: only root may set the sticky bit on a non-directory
PR:		216391
Reported by:	pjdfstest
Sponsored by:	The FreeBSD Foundation
2019-05-04 16:27:58 +00:00
Alan Somers
93198e64fb fusefs: fix a memory leak from r346979
PR:		216391
Sponsored by:	The FreeBSD Foundation
2019-05-01 17:24:53 +00:00
Alan Somers
474ba6fa3b fusefs: fix some permission checks with -o default_permissions
When mounted with -o default_permissions fusefs is supposed to validate all
permissions in the kernel, not the file system.  This commit fixes two
permissions that I had previously overlooked.

* Only root may chown a file
* Non-root users may only chgrp a file to a group to which they belong

PR:		216391
Sponsored by:	The FreeBSD Foundation
2019-05-01 00:00:49 +00:00
Alan Somers
ede571e40a fusefs: support unix-domain sockets
Also, fix the teardown of the Fifo.read_write test

Sponsored by:	The FreeBSD Foundation
2019-04-29 16:24:51 +00:00
Alan Somers
f9b0e30ba7 fusefs: FIFO support
Sponsored by:	The FreeBSD Foundation
2019-04-29 01:40:35 +00:00
Alan Somers
9c7ec33162 fusefs: fix a deadlock in VOP_PUTPAGES
As of r346162 fuse now invalidates the cache during writes.  But it can't do
that when writing from VOP_PUTPAGES, because the write is coming _from_ the
cache.  Trying to invalidate the cache in that situation causes a deadlock
in vm_object_page_remove, because the pages in question have already been
busied by the same thread.

PR:		235774
Sponsored by:	The FreeBSD Foundation
2019-04-26 19:47:43 +00:00
Alan Somers
419e7ff674 fusefs: rename the SDT probes from "fuse" to "fusefs"
This matches the new name of the kld.

Sponsored by:	The FreeBSD Foundation
2019-04-20 00:04:31 +00:00