Commit Graph

772 Commits

Author SHA1 Message Date
Alan Somers
c2265ae7a8 fusefs: skip some tests when unsafe aio is disabled
MFC after:	16 days
MFC-With:	r350665
Sponsored by:	The FreeBSD Foundation
2019-08-12 20:00:21 +00:00
Alan Somers
9f13765e42 fusefs: fix building tests with GCC 8
GCC 8 objected to including C++-only flags in CWARNFLAGS

Sponsored by:	The FreeBSD Foundation
2019-07-30 19:47:45 +00:00
Alan Somers
f0c07f0ce8 fusefs: nul-terminate some strings in the readdir test
Reported by:	GCC 8
Sponsored by:	The FreeBSD Foundation
2019-07-30 17:31:09 +00:00
Alan Somers
669a092af1 fusefs: fix panic when writing with O_DIRECT and using writeback cache
When a fusefs file system is mounted using the writeback cache, the cache
may still be bypassed by opening a file with O_DIRECT.  When writing with
O_DIRECT, the cache must be invalidated for the affected portion of the
file.  Fix some panics caused by inadvertently invalidating too much.

Sponsored by:	The FreeBSD Foundation
2019-07-28 15:17:32 +00:00
Alan Somers
a63915c2d7 MFHead @r350386
Sponsored by:	The FreeBSD Foundation
2019-07-28 04:02:22 +00:00
Li-Wen Hsu
1ab93d1f23 Temporarily skip flakey test case
sys.kern.ptrace_test.ptrace__follow_fork_parent_detached_unrelated_debugger

PR:		239425
Sponsored by:	The FreeBSD Foundation
2019-07-24 17:41:40 +00:00
Li-Wen Hsu
c2dc497a38 Temporarily skip flakey test case
sys.kern.ptrace_test.ptrace__parent_sees_exit_after_child_debugger

PR:		239399
Sponsored by:	The FreeBSD Foundation
2019-07-23 09:39:27 +00:00
Li-Wen Hsu
ea24861d5e Temporarily skip flakey test case
sys.kern.ptrace_test.ptrace__follow_fork_both_attached_unrelated_debugger

PR:		239397
Sponsored by:	The FreeBSD Foundation
2019-07-23 09:19:58 +00:00
Li-Wen Hsu
7d1f74716c Temporarily skip flakey test case
sys.kern.ptrace_test.ptrace__PT_KILL_competing_stop

PR:		220841
Sponsored by:	The FreeBSD Foundation
2019-07-23 07:56:42 +00:00
Li-Wen Hsu
4f6d74c9c4 Temporarily skip sys.netpfil.pf.forward.{v4,v6} and sys.netpfil.pf.set_tos.v4
on i386 as they are flakey on it

PR:		239380
Sponsored by:	The FreeBSD Foundation
2019-07-22 18:54:26 +00:00
Li-Wen Hsu
63b0609c12 Fix URL.
Sponsored by:	The FreeBSD Foundation
2019-07-22 18:43:46 +00:00
Li-Wen Hsu
37ba9b348b Temporarily skip flakey test case
sys.kern.ptrace_test.ptrace__follow_fork_child_detached_unrelated_debugger

PR:		239292
Sponsored by:	The FreeBSD Foundation
2019-07-22 10:37:56 +00:00
Alan Somers
5a0b9a2776 fusefs: fix warnings in the tests reported by GCC
Sponsored by:	The FreeBSD Foundation
2019-07-20 05:21:13 +00:00
Alan Somers
fca79580be sendfile: don't panic when VOP_GETPAGES_ASYNC returns an error
PR:		236466
Sponsored by:	The FreeBSD Foundation
2019-07-19 18:03:30 +00:00
Alan Somers
ed74f781c9 fusefs: add a intr/nointr mount option
FUSE file systems can optionally support interrupting outstanding
operations.  However, the file system does not identify to the kernel at
mount time whether it's capable of doing that.  Instead it signals its
noncapability by returning ENOSYS to the first FUSE_INTERRUPT operation it
receives.  That's a problem for reliable signal delivery, because the kernel
must choose which thread should get a signal before it knows whether the
FUSE server can handle interrupts.  The problem is even worse because the
FUSE protocol allows a file system to simply ignore all FUSE_INTERRUPT
operations.

Fix the signal delivery logic by making interruptibility an opt-in mount
option.  This will require a corresponding change to libfuse, but not to
most file systems that link to libfuse.

Bump __FreeBSD_version due to the new mount option.

Sponsored by:	The FreeBSD Foundation
2019-07-18 17:55:13 +00:00
Alan Somers
d26d63a4af fusefs: multiple interruptility improvements
1) Don't explicitly not mask SIGKILL.  kern_sigprocmask won't allow it to be
   masked, anyway.

2) Fix an infinite loop bug.  If a process received both a maskable signal
   lower than 9 (like SIGINT) and then received SIGKILL,
   fticket_wait_answer would spin.  msleep would immediately return EINTR,
   but cursig would return SIGINT, so the sleep would get retried.  Fix it
   by explicitly checking whether SIGKILL has been received.

3) Abandon the sig_isfatal optimization introduced by r346357.  That
   optimization would cause fticket_wait_answer to return immediately,
   without waiting for a response from the server, if the process were going
   to exit anyway.  However, it's vulnerable to a race:

   1) fatal signal is received while fticket_wait_answer is sleeping.
   2) fticket_wait_answer sends the FUSE_INTERRUPT operation.
   3) fticket_wait_answer determines that the signal was fatal and returns
      without waiting for a response.
   4) Another thread changes the signal to non-fatal.
   5) The first thread returns to userspace.  Instead of exiting, the
      process continues.
   6) The application receives EINTR, wrongly believes that the operation
      was successfully interrupted, and restarts it.  This could cause
      problems for non-idempotent operations like FUSE_RENAME.

Reported by:    kib (the race part)
Sponsored by:   The FreeBSD Foundation
2019-07-17 22:45:43 +00:00
John Baldwin
32451fb9fc Add ptrace op PT_GET_SC_RET.
This ptrace operation returns a structure containing the error and
return values from the current system call.  It is only valid when a
thread is stopped during a system call exit (PL_FLAG_SCX is set).

The sr_error member holds the error value from the system call.  Note
that this error value is the native FreeBSD error value that has _not_
been translated to an ABI-specific error value similar to the values
logged to ktrace.

If sr_error is zero, then the return values of the system call will be
set in sr_retval[0] and sr_retval[1].

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D20901
2019-07-15 21:48:02 +00:00
John Baldwin
c8ea87310c Add a test for PT_GET_SC_ARGS.
Reviewed by:	kib
MFC after:	1 month
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D20899
2019-07-15 21:26:55 +00:00
Alan Somers
97b0512b23 projects/fuse2: build fixes
* Fix the kernel build with gcc by removing a redundant extern declaration
* In the tests, fix a printf format specifier that assumed LP64

Sponsored by:	The FreeBSD Foundation
2019-07-13 14:42:09 +00:00
Li-Wen Hsu
1db8307b66 Correct definitions in sys.opencrypto.runtests.main for 32bit platform
Reviewed by:	cem, jhb
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20894
2019-07-10 01:08:08 +00:00
Li-Wen Hsu
43527153dc Skip sys.netpfil.pf.names.names and sys.netpfil.pf.synproxy.synproxy
temporarily because kernel panics when flushing epair queue.

PR:		238870
Sponsored by:	The FreeBSD Foundation
2019-06-29 12:19:57 +00:00
Alan Somers
7e1f5432f4 fusefs: don't leak memory of unsent operations on unmount
Sponsored by:	The FreeBSD Foundation
2019-06-28 18:48:02 +00:00
Alan Somers
7f49ce7a0b MFHead @349476
Sponsored by:	The FreeBSD Foundation
2019-06-27 23:50:54 +00:00
Alan Somers
435ecf40bb fusefs: recycle vnodes after their last unlink
Previously fusefs would never recycle vnodes.  After VOP_INACTIVE, they'd
linger around until unmount or the vnlru reclaimed them.  This commit
essentially actives and inlines the old reclaim_revoked sysctl, and fixes
some issues dealing with the attribute cache and multiply linked files.

Sponsored by:	The FreeBSD Foundation
2019-06-27 20:18:12 +00:00
Alan Somers
9cf5812603 fusefs: fix a memory leak in the forget test
Sponsored by:	The FreeBSD Foundation
2019-06-27 17:44:21 +00:00
Alan Somers
f74b33d9db fusefs: tighten expectations in mmap tests
In r349378 I fixed mmap's habit of reading more data than was available.

Sponsored by:	The FreeBSD Foundation
2019-06-26 23:10:20 +00:00
Alan Somers
7fc0921d7e fusefs: annotate deliberate file descriptor leaks in the tests
closing a file descriptor causes FUSE activity that is superfluous to the
purpose of most tests, but would nonetheless require matching expectations.
Rather than do that, most tests deliberately leak file descriptors instead.
This commit moves the leakage from each test into two trivial functions:
leak and leakdir.  Hopefully Coverity will only complain about those
functions and not all of their callers.

Sponsored by:	The FreeBSD Foundation
2019-06-26 20:25:57 +00:00
Alan Somers
c51f519b36 fusefs: run the io tests with direct io, too
Now the io tests are run in all cache modes.  The fusefs test suite can now
get adequate coverage without changing the value of
vfs.fusefs.data_cache_mode, which is only needed for legacy file systems
now.

Sponsored by:	The FreeBSD Foundation
2019-06-26 19:10:39 +00:00
Alan Somers
f8ebf1cd7e fusefs: implement protocol 7.23's FUSE_WRITEBACK_CACHE option
As of protocol 7.23, fuse file systems can specify their cache behavior on a
per-mountpoint basis.  If they set FUSE_WRITEBACK_CACHE in
fuse_init_out.flags, then they'll get the writeback cache.  If not, then
they'll get the writethrough cache.  If they set FOPEN_DIRECT_IO in every
FUSE_OPEN response, then they'll get no cache at all.

The old vfs.fusefs.data_cache_mode sysctl is ignored for servers that use
protocol 7.23 or later.  However, it's retained for older servers,
especially for those running in jails that lack access to the new protocol.

This commit also fixes two other minor test bugs:
* WriteCluster:SetUp was using an uninitialized variable.
* Read.direct_io_pread wasn't verifying that the cache was actually
  bypassed.

Sponsored by:	The FreeBSD Foundation
2019-06-26 17:32:31 +00:00
Alan Somers
fef464546c fusefs: implement the "time_gran" feature.
If a server supports a timestamp granularity other than 1ns, it can tell the
client this as of protocol 7.23.  The client will use that granularity when
updating its cached timestamps during write.  This way the timestamps won't
appear to change following flush.

Sponsored by:	The FreeBSD Foundation
2019-06-26 02:09:22 +00:00
Alan Somers
6efc53b9f3 fusefs: delete obsolete comments in the tests
I originally thought that the kernel would be responsible for ctime in
protocol 7.23.  But now I realize that's not the case.  The server is
responsible for ctime.  The kernel only sets it when there are dirty writes
cached, because that's when the server can't.

Sponsored by:	The FreeBSD Foundation
2019-06-26 00:06:41 +00:00
Alan Somers
0a8fe2d369 fusefs: set ctime during FUSE_SETATTR following a write
As of r349396 the kernel will internally update the mtime and ctime of files
on write.  It will also flush the mtime should a SETATTR happen before the
data cache gets flushed.  Now it will flush the ctime too, if the server is
using protocol 7.23 or higher.

This is the only case in which the kernel will explicitly set a file's
ctime, since neither utimensat(2) nor any other user interfaces allow it.

Sponsored by:	The FreeBSD Foundation
2019-06-26 00:03:37 +00:00
Alan Somers
788af9538a fusefs: automatically update mtime and ctime on write
Writing should implicitly update a file's mtime and ctime.  For fuse, the
server is supposed to do that.  But the client needs to do it too, because
the FUSE_WRITE response does not include time attributes, and it's not
desirable to issue a GETATTR after every WRITE.  When using the writeback
cache, there's another hitch: the kernel should ignore the mtime and ctime
fields in any GETATTR response for files with a dirty write cache.

Sponsored by:	The FreeBSD Foundation
2019-06-25 23:40:18 +00:00
Alan Somers
f2704f05cb fusefs: fix the tests for non-default values of MAXPHYS
Sponsored by:	The FreeBSD Foundation
2019-06-25 21:21:34 +00:00
Alan Somers
6ca3b02b1b fusefs: fix the tests for nondefault values of vfs.maxbcachebuf
Sponsored by:	The FreeBSD Foundation
2019-06-25 18:58:51 +00:00
Alan Somers
0d3a88d76c fusefs: writes should update the file size, even when data_cache_mode=0
Writes that extend a file should update the file's size.  r344185 restricted
that behavior for fusefs to only happen when the data cache was enabled.
That probably made sense at the time because the attribute cache wasn't
fully baked yet.  Now that it is, we should always update the cached file
size during write.

Sponsored by:	The FreeBSD Foundation
2019-06-25 18:36:11 +00:00
Alan Somers
b9e2019755 fusefs: rewrite vop_getpages and vop_putpages
Use the standard facilities for getpages and putpages instead of bespoke
implementations that don't work well with the writeback cache.  This has
several corollaries:

* Change the way we handle short reads _again_.  vfs_bio_getpages doesn't
  provide any way to handle unexpected short reads.  Plus, I found some more
  lock-order problems.  So now when the short read is detected we'll just
  clear the vnode's attribute cache, forcing the file size to be requeried
  the next time it's needed.  VOP_GETPAGES doesn't have any way to indicate
  a short read to the "caller", so we just bzero the rest of the page
  whenever a short read happens.

* Change the way we decide when to set the FUSE_WRITE_CACHE bit.  We now set
  it for clustered writes even when the writeback cache is not in use.

Sponsored by:   The FreeBSD Foundation
2019-06-25 17:24:43 +00:00
Alan Somers
48417ae0ba fusefs: fix multiple issues with the io tests
* During TearDown, close the test file before the backing file.  That way
  the backing file artifact will have the correct contents after the test
  completes.  It doesn't matter when running in Kyua, but it may when
  running the test manually.
* Add a closeopen operation that mimics what FSX does with the "-c" option.
* Skip mmap-related tests when vfs.fusefs.data_cache_mode == 0

Sponsored by:	The FreeBSD Foundation
2019-06-25 16:49:20 +00:00
Alan Somers
17575bad85 fusefs: improve the short read fix from r349279
VOP_GETPAGES intentionally tries to read beyond EOF, so fuse_read_biobackend
can't rely on bp->b_resid > 0 indicating a short read.  And adjusting
bp->b_count after a short read seems to cause some sort of resource leak.
Instead, store the shortfall in the bp->b_fsprivate1 field.

Sponsored by:	The FreeBSD Foundation
2019-06-24 17:05:31 +00:00
Li-Wen Hsu
01e92e2977 Skip sys.netinet.socket_afinet.socket_afinet_bind_zero temporarily because it
doesn't work when mac_portacl(4) loaded

PR:		238781
Sponsored by:	The FreeBSD Foundation
2019-06-23 19:37:12 +00:00
Alan Somers
aef22f2d75 fusefs: correctly handle short reads
A fuse server may return a short read for three reasons:

* The file is opened with FOPEN_DIRECT_IO.  In this case, the short read
  should be returned directly to userland.  We already handled this case
  correctly.

* The file was truncated server-side, and the read hit EOF.  In this case,
  the kernel should update the file size.  Fixed in the case of VOP_READ.
  Fixing this for VOP_GETPAGES is TODO.

* The file is opened in writeback mode, there are dirty buffers past what
  the server thinks is the file's EOF, and the read hit what the server
  thinks is the file's EOF.  In this case, the client is trying to read a
  hole, and should zero-fill it.  We already handled this case, and I added
  a test for it.

Sponsored by:	The FreeBSD Foundation
2019-06-21 21:44:31 +00:00
Alan Somers
87ff949a7b fusefs: raise protocol level to 7.23
None of the new features are implemented yet.  This commit just adds the new
protocol definitions and adds backwards-compatibility code for pre 7.23
servers.

Sponsored by:	The FreeBSD Foundation
2019-06-21 04:57:23 +00:00
Alan Somers
1f309e37f5 fusefs: update tests after r349260
r349260 removed some Linuxisms from the FUSE protocol header file in favor
of standard C99 types.  This change follows suit in the tests.

Sponsored by:	The FreeBSD Foundation
2019-06-21 04:37:11 +00:00
Alan Somers
7cbb8e8a06 fusefs: raise protocol level to 7.15
This protocol level adds two new features: the ability for the server to
store or retrieve data into/from the client's cache.  But the messages
aren't defined soundly since they identify the file only by its inode,
without the generation number.  So it's possible for them to modify the
wrong file's cache.  Also, I don't know of any file systems in ports that
use these messages.  So I'm not implementing them.  I did add a (disabled)
test for the store message, however.

Sponsored by:	The FreeBSD Foundation
2019-06-20 23:32:25 +00:00
Alan Somers
a1c9f4ad0d fusefs: implement VOP_BMAP
If the fuse daemon supports FUSE_BMAP, then use that for the block mapping.
Otherwise, use the same technique used by vop_stdbmap.  Report large values
for runp and runb in order to maximize read clustering and minimize upcalls,
even if we don't know the true layout.

The major result of this change is that sequential reads to FUSE files will
now usually happen 128KB at a time instead of 64KB.

Sponsored by:	The FreeBSD Foundation
2019-06-20 17:08:21 +00:00
Alan Somers
e532a99901 MFHead @349234
Sponsored by:	The FreeBSD Foundation
2019-06-20 15:56:08 +00:00
Alan Somers
84879e46c2 fusefs: multiple fixes related to the write cache
* Don't always write the last page synchronously.  That's not actually
  required.  It was probably just masking another bug that I fixed later,
  possibly in r349021.

* Enable the NotifyWriteback tests now that Writeback cache is working.

* Add a test to ensure that the write cache isn't flushed synchronously when
  in writeback mode.

Sponsored by:	The FreeBSD Foundation
2019-06-17 23:34:11 +00:00
Alan Somers
0482ec3e3f fusefs: run the Io tests with various combinations of mount options
Sponsored by:	The FreeBSD Foundation
2019-06-17 22:13:59 +00:00
Alan Somers
402b609c80 fusefs: use cluster_read for more readahead
fusefs will now use cluster_read.  This allows readahead of more than one
cache block.  However, it won't yet actually cluster the reads because that
requires VOP_BMAP, which fusefs does not yet implement.

Sponsored by:	The FreeBSD Foundation
2019-06-17 22:01:23 +00:00
Conrad Meyer
179f62805c random(4): Fortuna: allow increased concurrency
Add experimental feature to increase concurrency in Fortuna.  As this
diverges slightly from canonical Fortuna, and due to the security
sensitivity of random(4), it is off by default.  To enable it, set the
tunable kern.random.fortuna.concurrent_read="1".  The rest of this commit
message describes the behavior when enabled.

Readers continue to update shared Fortuna state under global mutex, as they
do in the status quo implementation of the algorithm, but shift the actual
PRF generation out from under the global lock.  This massively reduces the
CPU time readers spend holding the global lock, allowing for increased
concurrency on SMP systems and less bullying of the harvestq kthread.

It is somewhat of a deviation from FS&K.  I think the primary difference is
that the specific sequence of AES keys will differ if READ_RANDOM_UIO is
accessed concurrently (as the 2nd thread to take the mutex will no longer
receive a key derived from rekeying the first thread).  However, I believe
the goals of rekeying AES are maintained: trivially, we continue to rekey
every 1MB for the statistical property; and each consumer gets a
forward-secret, independent AES key for their PRF.

Since Chacha doesn't need to rekey for sequences of any length, this change
makes no difference to the sequence of Chacha keys and PRF generated when
Chacha is used in place of AES.

On a GENERIC 4-thread VM (so, INVARIANTS/WITNESS, numbers not necessarily
representative), 3x concurrent AES performance jumped from ~55 MiB/s per
thread to ~197 MB/s per thread.  Concurrent Chacha20 at 3 threads went from
roughly ~113 MB/s per thread to ~430 MB/s per thread.

Prior to this change, the system was extremely unresponsive with 3-4
concurrent random readers; each thread had high variance in latency and
throughput, depending on who got lucky and won the lock.  "rand_harvestq"
thread CPU use was high (double digits), seemingly due to spinning on the
global lock.

After the change, concurrent random readers and the system in general are
much more responsive, and rand_harvestq CPU use dropped to basically zero.

Tests are added to the devrandom suite to ensure the uint128_add64 primitive
utilized by unlocked read functions to specification.

Reviewed by:	markm
Approved by:	secteam(delphij)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D20313
2019-06-17 20:29:13 +00:00