freebsd-skq

Author	SHA1	Message	Date
Mateusz Guzik	388820fbef	fusefs: add missing CLTFLAG_MPSAFE annotation	2020-01-15 01:31:28 +00:00
Mateusz Guzik	b249ce48ea	vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427	2020-01-03 22:29:58 +00:00
Mateusz Guzik	6fa079fc3f	vfs: flatten vop vectors This eliminates the following loop from all VOP calls: while(vop != NULL && \ vop->vop_spare2 == NULL && vop->vop_bypass == NULL) vop = vop->vop_default; Reviewed by: jeff Tesetd by: pho Differential Revision: https://reviews.freebsd.org/D22738	2019-12-16 00:06:22 +00:00
Mateusz Guzik	abd80ddb94	vfs: introduce v_irflag and make v_type smaller The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715	2019-12-08 21:30:04 +00:00
Alan Somers	320c848ff6	Fix an off-by-one error from r351961 That revision addressed a Coverity CID that could lead to a buffer overflow, but it had an off-by-one error in the buffer size check. Reported by: Coverity Coverity CID: 1405530 MFC after: 3 days MFC-With: 351961 Sponsored by: The FreeBSD Foundation	2019-09-16 16:41:01 +00:00
Alan Somers	42767f76af	fusefs: fix some minor issues with fuse_vnode_setparent * When unparenting a vnode, actually clear the flag. AFAIK this is basically a no-op because we only unparent a vnode when reclaiming it or when unlinking. * There's no need to call fuse_vnode_setparent during reclaim, because we're about to free the vnode data anyway. Reviewed by: emaste MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21630	2019-09-16 14:51:49 +00:00
Alan Somers	6c0c362075	fusefs: Fix iosize for FUSE_WRITE in 7.8 compat mode When communicating with a FUSE server that implements version 7.8 (or older) of the FUSE protocol, the FUSE_WRITE request structure is 16 bytes shorter than normal. The protocol version check wasn't applied universally, leading to an extra 16 bytes being sent to such servers. The extra bytes were allocated and bzero()d, so there was no information disclosure. Reviewed by: emaste MFC after: 3 days MFC-With: r350665 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21557	2019-09-11 19:29:40 +00:00
Alan Somers	16f8783452	Coverity fixes in fusefs(5) CID 1404532 fixes a signed vs unsigned comparison error in fuse_vnop_bmap. It could potentially have resulted in VOP_BMAP reporting too many consecutive blocks. CID 1404364 is much worse. It was an array access by an untrusted, user-provided variable. It could potentially have resulted in a malicious file system crashing the kernel or worse. Reported by: Coverity Reviewed by: emaste MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21466	2019-09-06 19:40:11 +00:00
Mark Johnston	9222b82368	Remove unused VM page locking macros. They were orphaned by r292373. Reviewed by: asomers MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D21469	2019-08-29 22:13:15 +00:00
Konstantin Belousov	6470c8d3db	Rework v_object lifecycle for vnodes. Current implementation of vnode_create_vobject() and vnode_destroy_vobject() is written so that it prepared to handle the vm object destruction for live vnode. Practically, no filesystems use this, except for some remnants that were present in UFS till today. One of the consequences of that model is that each filesystem must call vnode_destroy_vobject() in VOP_RECLAIM() or earlier, as result all of them get rid of the v_object in reclaim. Move the call to vnode_destroy_vobject() to vgonel() before VOP_RECLAIM(). This makes v_object stable: either the object is NULL, or it is valid vm object till the vnode reclamation. Remove code from vnode_create_vobject() to handle races with the parallel destruction. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D21412	2019-08-29 07:50:25 +00:00
Alan Somers	5e63333052	fusefs: Fix some bugs regarding the size of the LISTXATTR list * A small error in r338152 let to the returned size always being exactly eight bytes too large. * The FUSE_LISTXATTR operation works like Linux's listxattr(2): if the caller does not provide enough space, then the server should return ERANGE rather than return a truncated list. That's true even though in FUSE's case the kernel doesn't provide space to the client at all; it simply requests a maximum size for the list. We previously weren't handling the case where the server returns ERANGE even though the kernel requested as much size as the server had told us it needs; that can happen due to a race. * We also need to ensure that a pathological server that always returns ERANGE no matter what size we request in FUSE_LISTXATTR won't cause an infinite loop in the kernel. As of this commit, it will instead cause an infinite loop that exits and enters the kernel on each iteration, allowing signals to be processed. Reviewed by: cem MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21287	2019-08-28 04:19:37 +00:00
Alan Somers	3a79e8e772	fusefs: don't send the namespace during listextattr The FUSE_LISTXATTR operation always returns the full list of a file's extended attributes, in all namespaces. There's no way to filter the list server-side. However, currently FreeBSD's fusefs driver sends a namespace string with the FUSE_LISTXATTR request. That behavior was probably copied from fuse_vnop_getextattr, which has an attribute name argument. It's been there ever since extended attribute support was added in r324620. This commit removes it. Reviewed by: cem MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21280	2019-08-16 05:06:54 +00:00
Alan Somers	bf50749773	fusefs: Fix the size of fuse_getattr_in In FUSE protocol 7.9, the size of the FUSE_GETATTR request has increased. However, the fusefs driver is currently not sending the additional fields. In our implementation, the additional fields are always zero, so I there haven't been any test failures until now. But fusefs-lkl requires the request's length to be correct. Fix this bug, and also enhance the test suite to catch similar bugs. PR: 239830 MFC after: 2 weeks MFC-With: 350665 Sponsored by: The FreeBSD Foundation	2019-08-14 20:45:00 +00:00
Alan Somers	427d205cb5	fusefs: remove superfluous counter_u64_zero Reported by: glebius Sponsored by: The FreeBSD Foundation	2019-08-06 00:50:25 +00:00
Alan Somers	508abc9494	fusefs: fix the build after r350446 fuse needs to include an additional header after r350446 Sponsored by: The FreeBSD Foundation	2019-07-31 21:48:35 +00:00
Alan Somers	58df81b339	MFHead @350426 Sponsored by: The FreeBSD Foundation	2019-07-30 04:17:36 +00:00
Mark Johnston	918988576c	Avoid relying on header pollution from sys/refcount.h. MFC after: 3 days Sponsored by: The FreeBSD Foundation	2019-07-29 20:26:01 +00:00
Alan Somers	669a092af1	fusefs: fix panic when writing with O_DIRECT and using writeback cache When a fusefs file system is mounted using the writeback cache, the cache may still be bypassed by opening a file with O_DIRECT. When writing with O_DIRECT, the cache must be invalidated for the affected portion of the file. Fix some panics caused by inadvertently invalidating too much. Sponsored by: The FreeBSD Foundation	2019-07-28 15:17:32 +00:00
Alan Somers	ed74f781c9	fusefs: add a intr/nointr mount option FUSE file systems can optionally support interrupting outstanding operations. However, the file system does not identify to the kernel at mount time whether it's capable of doing that. Instead it signals its noncapability by returning ENOSYS to the first FUSE_INTERRUPT operation it receives. That's a problem for reliable signal delivery, because the kernel must choose which thread should get a signal before it knows whether the FUSE server can handle interrupts. The problem is even worse because the FUSE protocol allows a file system to simply ignore all FUSE_INTERRUPT operations. Fix the signal delivery logic by making interruptibility an opt-in mount option. This will require a corresponding change to libfuse, but not to most file systems that link to libfuse. Bump __FreeBSD_version due to the new mount option. Sponsored by: The FreeBSD Foundation	2019-07-18 17:55:13 +00:00
Alan Somers	f05962453e	fusefs: fix another semi-infinite loop bug regarding signal handling fticket_wait_answer would spin if it received an unhandled signal whose default disposition is to terminate. The reason is because msleep(9) would return EINTR even for a masked signal. One reason is when the thread is stopped, which happens for example during sigexit(). Fix this bug by returning immediately if fticket_wait_answer ever gets interrupted a second time, for any reason. Sponsored by: The FreeBSD Foundation	2019-07-18 15:30:00 +00:00
Alan Somers	d26d63a4af	fusefs: multiple interruptility improvements 1) Don't explicitly not mask SIGKILL. kern_sigprocmask won't allow it to be masked, anyway. 2) Fix an infinite loop bug. If a process received both a maskable signal lower than 9 (like SIGINT) and then received SIGKILL, fticket_wait_answer would spin. msleep would immediately return EINTR, but cursig would return SIGINT, so the sleep would get retried. Fix it by explicitly checking whether SIGKILL has been received. 3) Abandon the sig_isfatal optimization introduced by r346357. That optimization would cause fticket_wait_answer to return immediately, without waiting for a response from the server, if the process were going to exit anyway. However, it's vulnerable to a race: 1) fatal signal is received while fticket_wait_answer is sleeping. 2) fticket_wait_answer sends the FUSE_INTERRUPT operation. 3) fticket_wait_answer determines that the signal was fatal and returns without waiting for a response. 4) Another thread changes the signal to non-fatal. 5) The first thread returns to userspace. Instead of exiting, the process continues. 6) The application receives EINTR, wrongly believes that the operation was successfully interrupted, and restarts it. This could cause problems for non-idempotent operations like FUSE_RENAME. Reported by: kib (the race part) Sponsored by: The FreeBSD Foundation	2019-07-17 22:45:43 +00:00
Alan Somers	07e86257e6	fusefs: fix the build with some NODEBUG kernels systm.h needs to be included before counter.h Sponsored by: The FreeBSD Foundation	2019-07-13 21:41:12 +00:00
Alan Somers	97b0512b23	projects/fuse2: build fixes * Fix the kernel build with gcc by removing a redundant extern declaration * In the tests, fix a printf format specifier that assumed LP64 Sponsored by: The FreeBSD Foundation	2019-07-13 14:42:09 +00:00
Alan Somers	7e1f5432f4	fusefs: don't leak memory of unsent operations on unmount Sponsored by: The FreeBSD Foundation	2019-06-28 18:48:02 +00:00
Alan Somers	8aafc8c389	[skip ci] update copyright headers in fusefs files Sponsored by: The FreeBSD Foundation	2019-06-28 04:18:10 +00:00
Alan Somers	c1afff113c	fusefs: fix a memory leak regarding FUSE_INTERRUPT We were leaking the fuse ticket if the original operation completed before the daemon received the INTERRUPT operation. Fixing this was easier than I expected. Sponsored by: The FreeBSD Foundation	2019-06-27 22:24:56 +00:00
Alan Somers	435ecf40bb	fusefs: recycle vnodes after their last unlink Previously fusefs would never recycle vnodes. After VOP_INACTIVE, they'd linger around until unmount or the vnlru reclaimed them. This commit essentially actives and inlines the old reclaim_revoked sysctl, and fixes some issues dealing with the attribute cache and multiply linked files. Sponsored by: The FreeBSD Foundation	2019-06-27 20:18:12 +00:00
Alan Somers	38c8634635	fusefs: counter(9) variables should not be statically initialized Reported by: rpokala Sponsored by: The FreeBSD Foundation	2019-06-27 17:59:15 +00:00
Alan Somers	560a55d094	fusefs: convert statistical sysctls to use counter(9) counter(9) is more performant than using atomic instructions to update sysctls that just report statistics to userland. Sponsored by: The FreeBSD Foundation	2019-06-27 16:30:25 +00:00
Alan Somers	caeea8b4cc	fusefs: fix some memory leaks Fix memory leaks relating to FUSE_BMAP and FUSE_CREATE. There are still leaks relating to FUSE_INTERRUPT, but they'll be harder to fix since the server is legally allowed to never respond to a FUSE_INTERRUPT operation. Sponsored by: The FreeBSD Foundation	2019-06-27 00:00:48 +00:00
Alan Somers	f8ebf1cd7e	fusefs: implement protocol 7.23's FUSE_WRITEBACK_CACHE option As of protocol 7.23, fuse file systems can specify their cache behavior on a per-mountpoint basis. If they set FUSE_WRITEBACK_CACHE in fuse_init_out.flags, then they'll get the writeback cache. If not, then they'll get the writethrough cache. If they set FOPEN_DIRECT_IO in every FUSE_OPEN response, then they'll get no cache at all. The old vfs.fusefs.data_cache_mode sysctl is ignored for servers that use protocol 7.23 or later. However, it's retained for older servers, especially for those running in jails that lack access to the new protocol. This commit also fixes two other minor test bugs: * WriteCluster:SetUp was using an uninitialized variable. * Read.direct_io_pread wasn't verifying that the cache was actually bypassed. Sponsored by: The FreeBSD Foundation	2019-06-26 17:32:31 +00:00
Alan Somers	205696a17d	fusefs: delete some unused mount options The fusefs kernel module allegedly supported no_attrcache, no_readahed, no_datacache, no_namecache, and no_mmap mount options, but the mount_fusefs binary never did. So there was no way to ever activate these options. Delete them. Some of them have alternatives: no_attrcache: set the attr_valid time to 0 in FUSE_LOOKUP and FUSE_GETATTR responses. no_readahed: set max_readahead to 0 in the FUSE_INIT response. no_datacache: set the vfs.fusefs.data_cache_mode sysctl to 0, or (coming soon) set the attr_valid time to 0 and set FUSE_AUTO_INVAL_DATA in the FUSE_INIT response. no_namecache: set entry_valid time to 0 in FUSE_LOOKUP and FUSE_GETATTR responses. Sponsored by: The FreeBSD Foundation	2019-06-26 15:15:24 +00:00
Alan Somers	fef464546c	fusefs: implement the "time_gran" feature. If a server supports a timestamp granularity other than 1ns, it can tell the client this as of protocol 7.23. The client will use that granularity when updating its cached timestamps during write. This way the timestamps won't appear to change following flush. Sponsored by: The FreeBSD Foundation	2019-06-26 02:09:22 +00:00
Alan Somers	0a8fe2d369	fusefs: set ctime during FUSE_SETATTR following a write As of r349396 the kernel will internally update the mtime and ctime of files on write. It will also flush the mtime should a SETATTR happen before the data cache gets flushed. Now it will flush the ctime too, if the server is using protocol 7.23 or higher. This is the only case in which the kernel will explicitly set a file's ctime, since neither utimensat(2) nor any other user interfaces allow it. Sponsored by: The FreeBSD Foundation	2019-06-26 00:03:37 +00:00
Alan Somers	788af9538a	fusefs: automatically update mtime and ctime on write Writing should implicitly update a file's mtime and ctime. For fuse, the server is supposed to do that. But the client needs to do it too, because the FUSE_WRITE response does not include time attributes, and it's not desirable to issue a GETATTR after every WRITE. When using the writeback cache, there's another hitch: the kernel should ignore the mtime and ctime fields in any GETATTR response for files with a dirty write cache. Sponsored by: The FreeBSD Foundation	2019-06-25 23:40:18 +00:00
Alan Somers	0d3a88d76c	fusefs: writes should update the file size, even when data_cache_mode=0 Writes that extend a file should update the file's size. r344185 restricted that behavior for fusefs to only happen when the data cache was enabled. That probably made sense at the time because the attribute cache wasn't fully baked yet. Now that it is, we should always update the cached file size during write. Sponsored by: The FreeBSD Foundation	2019-06-25 18:36:11 +00:00
Alan Somers	b9e2019755	fusefs: rewrite vop_getpages and vop_putpages Use the standard facilities for getpages and putpages instead of bespoke implementations that don't work well with the writeback cache. This has several corollaries: * Change the way we handle short reads _again_. vfs_bio_getpages doesn't provide any way to handle unexpected short reads. Plus, I found some more lock-order problems. So now when the short read is detected we'll just clear the vnode's attribute cache, forcing the file size to be requeried the next time it's needed. VOP_GETPAGES doesn't have any way to indicate a short read to the "caller", so we just bzero the rest of the page whenever a short read happens. * Change the way we decide when to set the FUSE_WRITE_CACHE bit. We now set it for clustered writes even when the writeback cache is not in use. Sponsored by: The FreeBSD Foundation	2019-06-25 17:24:43 +00:00
Alan Somers	1734e205f3	fusefs: refine the short read fix from r349332 b_fsprivate1 needs to be initialized even for write operations, probably because a buffer can be used to read, write, and read again with the final read serviced by cache. Sponsored by: The FreeBSD Foundation	2019-06-24 20:08:28 +00:00
Alan Somers	17575bad85	fusefs: improve the short read fix from r349279 VOP_GETPAGES intentionally tries to read beyond EOF, so fuse_read_biobackend can't rely on bp->b_resid > 0 indicating a short read. And adjusting bp->b_count after a short read seems to cause some sort of resource leak. Instead, store the shortfall in the bp->b_fsprivate1 field. Sponsored by: The FreeBSD Foundation	2019-06-24 17:05:31 +00:00
Alan Somers	44f654fdc5	fusefs: fix corruption on short reads caused by r349279 Even if a short read is caused by EOF, it's still necessary to bzero the remaining buffer, because that buffer could become valid as a result of a future ftruncate or pwrite operation. Reported by: fsx Sponsored by: The FreeBSD Foundation	2019-06-21 23:29:29 +00:00
Alan Somers	aef22f2d75	fusefs: correctly handle short reads A fuse server may return a short read for three reasons: * The file is opened with FOPEN_DIRECT_IO. In this case, the short read should be returned directly to userland. We already handled this case correctly. * The file was truncated server-side, and the read hit EOF. In this case, the kernel should update the file size. Fixed in the case of VOP_READ. Fixing this for VOP_GETPAGES is TODO. * The file is opened in writeback mode, there are dirty buffers past what the server thinks is the file's EOF, and the read hit what the server thinks is the file's EOF. In this case, the client is trying to read a hole, and should zero-fill it. We already handled this case, and I added a test for it. Sponsored by: The FreeBSD Foundation	2019-06-21 21:44:31 +00:00
Alan Somers	87ff949a7b	fusefs: raise protocol level to 7.23 None of the new features are implemented yet. This commit just adds the new protocol definitions and adds backwards-compatibility code for pre 7.23 servers. Sponsored by: The FreeBSD Foundation	2019-06-21 04:57:23 +00:00
Alan Somers	8f9b3ba718	fusefs: use standard integer types in fuse_kernel.h This is a merge of Linux revision 4c82456eeb4da081dd63dc69e91aa6deabd29e03. No functional change. Sponsored by: The FreeBSD Foundation	2019-06-21 03:17:27 +00:00
Alan Somers	b160acd1c0	fusefs: raise the protocol level to 7.21 Jumping from protocol 7.15 to 7.21 adds several new features. While they're all potentially useful, they're also all optional, and I'm not implementing any right now because my highest priority lies in a later version. Sponsored by: The FreeBSD Foundation	2019-06-21 03:04:56 +00:00
Alan Somers	ecb489158c	fusefs: diff reduction of fuse_kernel.h vs the upstream version fuse_kernel.h is based on Linux's fuse.h. In r349250 I modified fuse_kernel.h by generating a diff of two versions of Linux's fuse.h and applying it to our tree. patch succeeded, but it put one chunk in the wrong location. This commit fixes that. No functional changes. Sponsored by: The FreeBSD Foundation	2019-06-21 02:55:43 +00:00
Alan Somers	7cbb8e8a06	fusefs: raise protocol level to 7.15 This protocol level adds two new features: the ability for the server to store or retrieve data into/from the client's cache. But the messages aren't defined soundly since they identify the file only by its inode, without the generation number. So it's possible for them to modify the wrong file's cache. Also, I don't know of any file systems in ports that use these messages. So I'm not implementing them. I did add a (disabled) test for the store message, however. Sponsored by: The FreeBSD Foundation	2019-06-20 23:32:25 +00:00
Alan Somers	bb23d43901	fusefs: trivially raise protocol level to 7.14 The only new feature is splice(2) support on /dev/fuse, which FreeBSD can't support. Sponsored by: The FreeBSD Foundation	2019-06-20 23:12:19 +00:00
Alan Somers	192a918194	fusefs: attempt to support servers as old as protocol 7.4 Previously we allowed servers as old as 7.1 to connect (there never was a 7.0). However, we wrongly assumed a few things about protocols older than 7.8. This commit attempts to support servers as old as 7.4 but no older. I added no new tests because I'm not sure there actually _are_ any servers this old in the wild. Sponsored by: The FreeBSD Foundation	2019-06-20 22:21:42 +00:00
Alan Somers	2ffddc5ee9	fusefs: raise protocol level to 7.13 This protocol version adds one new feature: the ability for the server to set the maximum number of background requests and a "congestion threshold" with ill-defined properties. I don't know of any fuse file systems in ports that use this feature, so I'm not implementing it. Sponsored by: The FreeBSD Foundation	2019-06-20 21:29:28 +00:00
Alan Somers	a1c9f4ad0d	fusefs: implement VOP_BMAP If the fuse daemon supports FUSE_BMAP, then use that for the block mapping. Otherwise, use the same technique used by vop_stdbmap. Report large values for runp and runb in order to maximize read clustering and minimize upcalls, even if we don't know the true layout. The major result of this change is that sequential reads to FUSE files will now usually happen 128KB at a time instead of 64KB. Sponsored by: The FreeBSD Foundation	2019-06-20 17:08:21 +00:00

1 2 3 4 5 ...

268 Commits