* Fix the kernel build with gcc by removing a redundant extern declaration
* In the tests, fix a printf format specifier that assumed LP64
Sponsored by: The FreeBSD Foundation
Previously fusefs would never recycle vnodes. After VOP_INACTIVE, they'd
linger around until unmount or the vnlru reclaimed them. This commit
essentially actives and inlines the old reclaim_revoked sysctl, and fixes
some issues dealing with the attribute cache and multiply linked files.
Sponsored by: The FreeBSD Foundation
closing a file descriptor causes FUSE activity that is superfluous to the
purpose of most tests, but would nonetheless require matching expectations.
Rather than do that, most tests deliberately leak file descriptors instead.
This commit moves the leakage from each test into two trivial functions:
leak and leakdir. Hopefully Coverity will only complain about those
functions and not all of their callers.
Sponsored by: The FreeBSD Foundation
Now the io tests are run in all cache modes. The fusefs test suite can now
get adequate coverage without changing the value of
vfs.fusefs.data_cache_mode, which is only needed for legacy file systems
now.
Sponsored by: The FreeBSD Foundation
As of protocol 7.23, fuse file systems can specify their cache behavior on a
per-mountpoint basis. If they set FUSE_WRITEBACK_CACHE in
fuse_init_out.flags, then they'll get the writeback cache. If not, then
they'll get the writethrough cache. If they set FOPEN_DIRECT_IO in every
FUSE_OPEN response, then they'll get no cache at all.
The old vfs.fusefs.data_cache_mode sysctl is ignored for servers that use
protocol 7.23 or later. However, it's retained for older servers,
especially for those running in jails that lack access to the new protocol.
This commit also fixes two other minor test bugs:
* WriteCluster:SetUp was using an uninitialized variable.
* Read.direct_io_pread wasn't verifying that the cache was actually
bypassed.
Sponsored by: The FreeBSD Foundation
If a server supports a timestamp granularity other than 1ns, it can tell the
client this as of protocol 7.23. The client will use that granularity when
updating its cached timestamps during write. This way the timestamps won't
appear to change following flush.
Sponsored by: The FreeBSD Foundation
I originally thought that the kernel would be responsible for ctime in
protocol 7.23. But now I realize that's not the case. The server is
responsible for ctime. The kernel only sets it when there are dirty writes
cached, because that's when the server can't.
Sponsored by: The FreeBSD Foundation
As of r349396 the kernel will internally update the mtime and ctime of files
on write. It will also flush the mtime should a SETATTR happen before the
data cache gets flushed. Now it will flush the ctime too, if the server is
using protocol 7.23 or higher.
This is the only case in which the kernel will explicitly set a file's
ctime, since neither utimensat(2) nor any other user interfaces allow it.
Sponsored by: The FreeBSD Foundation
Writing should implicitly update a file's mtime and ctime. For fuse, the
server is supposed to do that. But the client needs to do it too, because
the FUSE_WRITE response does not include time attributes, and it's not
desirable to issue a GETATTR after every WRITE. When using the writeback
cache, there's another hitch: the kernel should ignore the mtime and ctime
fields in any GETATTR response for files with a dirty write cache.
Sponsored by: The FreeBSD Foundation
Writes that extend a file should update the file's size. r344185 restricted
that behavior for fusefs to only happen when the data cache was enabled.
That probably made sense at the time because the attribute cache wasn't
fully baked yet. Now that it is, we should always update the cached file
size during write.
Sponsored by: The FreeBSD Foundation
Use the standard facilities for getpages and putpages instead of bespoke
implementations that don't work well with the writeback cache. This has
several corollaries:
* Change the way we handle short reads _again_. vfs_bio_getpages doesn't
provide any way to handle unexpected short reads. Plus, I found some more
lock-order problems. So now when the short read is detected we'll just
clear the vnode's attribute cache, forcing the file size to be requeried
the next time it's needed. VOP_GETPAGES doesn't have any way to indicate
a short read to the "caller", so we just bzero the rest of the page
whenever a short read happens.
* Change the way we decide when to set the FUSE_WRITE_CACHE bit. We now set
it for clustered writes even when the writeback cache is not in use.
Sponsored by: The FreeBSD Foundation
* During TearDown, close the test file before the backing file. That way
the backing file artifact will have the correct contents after the test
completes. It doesn't matter when running in Kyua, but it may when
running the test manually.
* Add a closeopen operation that mimics what FSX does with the "-c" option.
* Skip mmap-related tests when vfs.fusefs.data_cache_mode == 0
Sponsored by: The FreeBSD Foundation
VOP_GETPAGES intentionally tries to read beyond EOF, so fuse_read_biobackend
can't rely on bp->b_resid > 0 indicating a short read. And adjusting
bp->b_count after a short read seems to cause some sort of resource leak.
Instead, store the shortfall in the bp->b_fsprivate1 field.
Sponsored by: The FreeBSD Foundation
A fuse server may return a short read for three reasons:
* The file is opened with FOPEN_DIRECT_IO. In this case, the short read
should be returned directly to userland. We already handled this case
correctly.
* The file was truncated server-side, and the read hit EOF. In this case,
the kernel should update the file size. Fixed in the case of VOP_READ.
Fixing this for VOP_GETPAGES is TODO.
* The file is opened in writeback mode, there are dirty buffers past what
the server thinks is the file's EOF, and the read hit what the server
thinks is the file's EOF. In this case, the client is trying to read a
hole, and should zero-fill it. We already handled this case, and I added
a test for it.
Sponsored by: The FreeBSD Foundation
None of the new features are implemented yet. This commit just adds the new
protocol definitions and adds backwards-compatibility code for pre 7.23
servers.
Sponsored by: The FreeBSD Foundation
r349260 removed some Linuxisms from the FUSE protocol header file in favor
of standard C99 types. This change follows suit in the tests.
Sponsored by: The FreeBSD Foundation
This protocol level adds two new features: the ability for the server to
store or retrieve data into/from the client's cache. But the messages
aren't defined soundly since they identify the file only by its inode,
without the generation number. So it's possible for them to modify the
wrong file's cache. Also, I don't know of any file systems in ports that
use these messages. So I'm not implementing them. I did add a (disabled)
test for the store message, however.
Sponsored by: The FreeBSD Foundation
If the fuse daemon supports FUSE_BMAP, then use that for the block mapping.
Otherwise, use the same technique used by vop_stdbmap. Report large values
for runp and runb in order to maximize read clustering and minimize upcalls,
even if we don't know the true layout.
The major result of this change is that sequential reads to FUSE files will
now usually happen 128KB at a time instead of 64KB.
Sponsored by: The FreeBSD Foundation
* Don't always write the last page synchronously. That's not actually
required. It was probably just masking another bug that I fixed later,
possibly in r349021.
* Enable the NotifyWriteback tests now that Writeback cache is working.
* Add a test to ensure that the write cache isn't flushed synchronously when
in writeback mode.
Sponsored by: The FreeBSD Foundation
fusefs will now use cluster_read. This allows readahead of more than one
cache block. However, it won't yet actually cluster the reads because that
requires VOP_BMAP, which fusefs does not yet implement.
Sponsored by: The FreeBSD Foundation
Add experimental feature to increase concurrency in Fortuna. As this
diverges slightly from canonical Fortuna, and due to the security
sensitivity of random(4), it is off by default. To enable it, set the
tunable kern.random.fortuna.concurrent_read="1". The rest of this commit
message describes the behavior when enabled.
Readers continue to update shared Fortuna state under global mutex, as they
do in the status quo implementation of the algorithm, but shift the actual
PRF generation out from under the global lock. This massively reduces the
CPU time readers spend holding the global lock, allowing for increased
concurrency on SMP systems and less bullying of the harvestq kthread.
It is somewhat of a deviation from FS&K. I think the primary difference is
that the specific sequence of AES keys will differ if READ_RANDOM_UIO is
accessed concurrently (as the 2nd thread to take the mutex will no longer
receive a key derived from rekeying the first thread). However, I believe
the goals of rekeying AES are maintained: trivially, we continue to rekey
every 1MB for the statistical property; and each consumer gets a
forward-secret, independent AES key for their PRF.
Since Chacha doesn't need to rekey for sequences of any length, this change
makes no difference to the sequence of Chacha keys and PRF generated when
Chacha is used in place of AES.
On a GENERIC 4-thread VM (so, INVARIANTS/WITNESS, numbers not necessarily
representative), 3x concurrent AES performance jumped from ~55 MiB/s per
thread to ~197 MB/s per thread. Concurrent Chacha20 at 3 threads went from
roughly ~113 MB/s per thread to ~430 MB/s per thread.
Prior to this change, the system was extremely unresponsive with 3-4
concurrent random readers; each thread had high variance in latency and
throughput, depending on who got lucky and won the lock. "rand_harvestq"
thread CPU use was high (double digits), seemingly due to spinning on the
global lock.
After the change, concurrent random readers and the system in general are
much more responsive, and rand_harvestq CPU use dropped to basically zero.
Tests are added to the devrandom suite to ensure the uint128_add64 primitive
utilized by unlocked read functions to specification.
Reviewed by: markm
Approved by: secteam(delphij)
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D20313
rename the source to gsb_crc32.c.
This is a prerequisite of unifying kernel zlib instances.
PR: 229763
Submitted by: Yoshihiro Ota <ota at j.email.ne.jp>
Differential Revision: https://reviews.freebsd.org/D20193
fusefs will now read ahead at most one cache block at a time (usually 64
KB). Clustered reads are still TODO. Individual file systems may disable
read ahead by setting fuse_init_out.max_readahead=0 during initialization.
Sponsored by: The FreeBSD Foundation
At a basic level, remove assumptions about the underlying algorithm (such as
output block size and reseeding requirements) from the algorithm-independent
logic in randomdev.c. Chacha20 does not have many of the restrictions that
AES-ICM does as a PRF (Pseudo-Random Function), because it has a cipher
block size of 512 bits. The motivation is that by generalizing the API,
Chacha is not penalized by the limitations of AES.
In READ_RANDOM_UIO, first attempt to NOWAIT allocate a large enough buffer
for the entire user request, or the maximal input we'll accept between
signal checking, whichever is smaller. The idea is that the implementation
of any randomdev algorithm is then free to divide up large requests in
whatever fashion it sees fit.
As part of this, two responsibilities from the "algorithm-generic" randomdev
code are pushed down into the Fortuna ra_read implementation (and any other
future or out-of-tree ra_read implementations):
1. If an algorithm needs to rekey every N bytes, it is responsible for
handling that in ra_read(). (I.e., Fortuna's 1MB rekey interval for AES
block generation.)
2. If an algorithm uses a block cipher that doesn't tolerate partial-block
requests (again, e.g., AES), it is also responsible for handling that in
ra_read().
Several APIs are changed from u_int buffer length to the more canonical
size_t. Several APIs are changed from taking a blockcount to a bytecount,
to permit PRFs like Chacha20 to directly generate quantities of output that
are not multiples of RANDOM_BLOCKSIZE (AES block size).
The Fortuna algorithm is changed to NOT rekey every 1MiB when in Chacha20
mode (kern.random.use_chacha20_cipher="1"). This is explicitly supported by
the math in FS&K §9.4 (Ferguson, Schneier, and Kohno; "Cryptography
Engineering"), as well as by their conclusion: "If we had a block cipher
with a 256-bit [or greater] block size, then the collisions would not
have been an issue at all."
For now, continue to break up reads into PAGE_SIZE chunks, as they were
before. So, no functional change, mostly.
Reviewed by: markm
Approved by: secteam(delphij)
Differential Revision: https://reviews.freebsd.org/D20312
Add some basic regression tests to verify behavior of both uint128
implementations at typical boundary conditions, to run on all architectures.
Test uint128 increment behavior of Chacha in keystream mode, as used by
'kern.random.use_chacha20_cipher=1' (r344913) to verify assumptions at edge
cases. These assumptions are critical to the safety of using Chacha as a
PRF in Fortuna (as implemented).
(Chacha's use in arc4random is safe regardless of these tests, as it is
limited to far less than 4 billion blocks of output in that API.)
Reviewed by: markm
Approved by: secteam(gordon)
Differential Revision: https://reviews.freebsd.org/D20392
Our fusefs(5) module supports three cache modes: uncached, write-through,
and write-back. However, the write-through mode (which is the default) has
never actually worked as its name suggests. Rather, it's always been more
like "write-around". It wrote directly, bypassing the cache. The cache
would only be populated by a subsequent read of the same data.
This commit fixes that problem. Now the write-through mode works as one
would expect: write(2) immediately adds data to the cache and then blocks
while the daemon processes the write operation.
A side effect of this change is that non-cache-block-aligned writes will now
incur a read-modify-write cycle of the cache block. The old behavior
(bypassing write cache entirely) can still be achieved by opening a file
with O_DIRECT.
PR: 237588
Sponsored by: The FreeBSD Foundation
Enable write clustering in fusefs whenever cache mode is set to writeback
and the "async" mount option is used. With default values for MAXPHYS,
DFLTPHYS, and the fuse max_write mount parameter, that means sequential
writes will now be written 128KB at a time instead of 64KB.
Also, add a regression test for PR 238565, a panic during unmount that
probably affects UFS, ext2, and msdosfs as well as fusefs.
PR: 238565
Sponsored by: The FreeBSD Foundation
An errant vfs_bio_clrbuf snuck in in r348931. Surprisingly, it doesn't have
any effect most of the time. But under some circumstances it cause the
buffer to behave in a write-only fashion.
Sponsored by: The FreeBSD Foundation
Implements the missing test cases for epair in a similar fashion to the
existing tests. Fixes shared abstractions to work with epair tests.
Submitted by: Ryan Moeller <ryan@freqlabs.com>
Reviewed by: asomers
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D20498
The current "writeback" cache mode, selected by the
vfs.fusefs.data_cache_mode sysctl, doesn't do writeback cacheing at all. It
merely goes through the motions of using buf(9), but then writes every
buffer synchronously. This commit:
* Enables delayed writes when the sysctl is set to writeback cacheing
* Fixes a cache-coherency problem when extending a file whose last page has
just been written.
* Removes the "sync" mount option, which had been set unconditionally.
* Adjusts some SDT probes
* Adds several new tests that mimic what fsx does but with more control and
without a real file system. As I discover failures with fsx, I add
regression tests to this file.
* Adds a test that ensures we can append to a file without reading any data
from it.
This change is still incomplete. Clustered writing is not yet supported,
and there are frequent "panic: vm_fault_hold: fault on nofault entry" panics
that I need to fix.
Sponsored by: The FreeBSD Foundation
In r348560 I thought that FUSE_EXPORT_SUPPORT was required for cases where
the node to be invalidated (or the parent of the entry to be invalidated)
wasn't cached. But I realize now that that's not the case. During entry
invalidation, if the parent isn't in the vfs hash table, then it must've
been reclaimed. And since fuse_vnop_reclaim does a cache_purge, that means
the entry to be invalidated has already been removed from the namecache.
And during inode invalidation, if the inode to be invalidated isn't in the
vfs hash table, then it too must've been reclaimed. In that case it will
have no buffer cache to invalidate.
Sponsored by: The FreeBSD Foundation
Protocol 7.12 adds a way for the server to notify the client that it should
invalidate an inode's data cache and/or attributes. This commit implements
that mechanism. Unlike Linux's implementation, ours requires that the file
system also supports FUSE_EXPORT_SUPPORT (NFS-style lookups). Otherwise the
invalidation operation will return EINVAL.
Sponsored by: The FreeBSD Foundation
Protocol 7.12 adds a way for the server to notify the client that it should
invalidate an entry from its name cache. This commit implements that
mechanism.
Sponsored by: The FreeBSD Foundation
FUSE allows entries to be cached for a limited amount of time. fusefs's
vnop_lookup method already implements that using the timeout functionality
of cache_lookup/cache_enter_time. However, lookups for the NFS server go
through a separate path: vfs_vget. That path can't use the same timeout
functionality because cache_lookup/cache_enter_time only work on pathnames,
whereas vfs_vget works by inode number.
This commit adds entry timeout information to the fuse vnode structure, and
checks it during vfs_vget. This allows the NFS server to take advantage of
cached entries. It's also the same path that FUSE's asynchronous cache
invalidation operations will use.
Sponsored by: The FreeBSD Foundation
The tests are failing because the return value and output have changed, but
before test code structure adjusted, removing these test cases help people
be able to focus on more important cases.
Discussed with: emaste
MFC with: r348206
Sponsored by: The FreeBSD Foundation
This commit raises the protocol level and adds backwards-compatibility code
to handle structure size changes. It doesn't implement any new features.
The new features added in protocol 7.12 are:
* server-side umask processing (which FreeBSD won't do)
* asynchronous inode and directory entry invalidation (which I'll do next)
Sponsored by: The FreeBSD Foundation
These fields are supposed to contain the file descriptor flags as supplied
to open(2) or set by fcntl(2). The feature is kindof useless on FreeBSD
since we don't supply all of these flags to fuse (because of the weak
relationship between struct file and struct vnode). But we should at least
set the access mode flags (O_RDONLY, etc).
This is the last fusefs change needed to get full protocol 7.9 support.
There are still a few options we don't support for good reason (mandatory
file locking is dumb, flock support is broken in the protocol until 7.17,
etc), but there's nothing else to do at this protocol level.
Sponsored by: The FreeBSD Foundation
If a FUSE file system sets the FUSE_POSIX_LOCKS flag then it can support
fcntl(2)-style locks directly. However, the protocol does not adequately
support flock(2)-style locks until revision 7.17. They must be implemented
locally in-kernel instead. This unfortunately breaks the interoperability
of fcntl(2) and flock(2) locks for file systems that support the former.
C'est la vie.
Prior to this commit flock(2) would get sent to the server as a
fcntl(2)-style lock with the lock owner field set to stack garbage.
Sponsored by: The FreeBSD Foundation
This bit tells the server that we're not sure which uid, gid, and/or pid
originated the write. I don't know of a single file system that cares, but
it's part of the protocol.
Sponsored by: The FreeBSD Foundation
* Prefer std::unique_ptr to raw pointers
* Prefer pass-by-reference to pass-by-pointer
* Prefer static_cast to C-style cast, unless it's too much typing
Reported by: ngie
Sponsored by: The FreeBSD Foundation
* Fix printf format strings on 32-bit OSes
* Fix -Wclass-memaccess violation on GCC-8 caused by using memset on an object
of non-trivial type.
* Fix memory leak in MockFS::init
* Fix -Wcast-align error on i386 in expect_readdir
* Fix some heterogenous comparison errors on 32-bit OSes.
Sponsored by: The FreeBSD Foundation
* Only build the tests on platforms with C++14 support
* Fix an undefined symbol error on lint builds
* Remove an unused function: fiov_clear
Sponsored by: The FreeBSD Foundation
If a daemon sets the FUSE_ASYNC_READ flag during initialization, then the
client is allowed to issue multiple concurrent reads for the same file
handle. Otherwise concurrent reads are not allowed. This commit implements
it. Previously we unconditionally disallowed concurrent reads.
Sponsored by: The FreeBSD Foundation
This commit adds the VOPs needed by userspace NFS servers (tested with
net/unfs3). More work is needed to make the in-kernel nfsd work, because of
its stateless nature. It doesn't open files prior to doing I/O. Also, the
NFS-related VOPs currently ignore the entry cache.
Sponsored by: The FreeBSD Foundation
When mounted with -o default_permissions and when
vfs.fusefs.data_cache_mode=2, fuse_io_strategy would try to clear the suid
bit after a successful write by a non-owner. When combined with a
not-yet-committed attribute-caching patch I'm working on, and if the
FUSE_SETATTR response indicates an unexpected filesize (legal, if the file
system has other clients), this would end up calling vtruncbuf. That would
panic, because the buffer lock was already held by bufwrite or bufstrategy
or something else upstack from fuse_vnop_strategy.
Sponsored by: The FreeBSD Foundation
to then try to reproduce a kernel panic, which turned out to be a
race condition and hard to test from here.
Commit the changes anywhere as the "bind zero" case was a surprise
to me and we should try to maintain this status.
Also it is easy examples someone can build upon.
With help from: markj
Event: Waterloo Hackathon 2019
I have contributed a number of changes to these tests over the past few
hundred revisions, and believe I deserve credit for the changes I have
made (plus, the copyright hadn't been updated since 2014).
MFC after: 1 week
This is not completely necessary today, but this change is being made in a
conservative manner to avoid accidental breakage in the future, if this ever
was a unicode string.
PR: 237403
MFC after: 1 week
In python 3, the default encoding was switched from ascii character sets to
unicode character sets in order to support internationalization by default.
Some interfaces, like ioctls and packets, however, specify data in terms of
non-unicode encodings formats, either in host endian (`fcntl.ioctl`) or
network endian (`dpkt`) byte order/format.
This change alters assumptions made by previous code where it was all
data objects were assumed to be basestrings, when they should have been
treated as byte arrays. In order to achieve this the following are done:
* str objects with encodings needing to be encoded as ascii byte arrays are
done so via `.encode("ascii")`. In order for this to work on python 3 in a
type agnostic way (as it anecdotally varied depending on the caller), call
`.encode("ascii")` only on str objects with python 3 to cast them to ascii
byte arrays in a helper function name `str_to_ascii(..)`.
* `dpkt.Packet` objects needing to be passed in to `fcntl.ioctl(..)` are done
so by casting them to byte arrays via `bytes()`, which calls
`dpkt.Packet__str__` under the covers and does the necessary str to byte array
conversion needed for the `dpkt` APIs and `struct` module.
In order to accomodate this change, apply the necessary typecasting for the
byte array literal in order to search `fop.name` for nul bytes.
This resolves all remaining python 2.x and python 3.x compatibility issues on
amd64. More work needs to be done for the tests to function with i386, in
general (this is a legacy issue).
PR: 237403
MFC after: 1 week
Tested with: python 2.7.16 (amd64), python 3.6.8 (amd64)
Even though some python styles suggest there should be multiple newlines between
methods/classes, for consistency with the surrounding code, it's best to be
consistent by having merely one newline between each functional block.
MFC after: 1 week
Make `KAT(CCM)?Parser` into a context suite-capable object by implementing
`__enter__` and `__exit__` methods which manage opening up the file descriptors
and closing them on context exit. This implementation was decided over adding
destructor logic to a `__del__` method, as there are a number of issues around
object lifetimes when dealing with threading cleanup, atexit handlers, and a
number of other less obvious edgecases. Plus, the architected solution is more
pythonic and clean.
Complete the iterator implementation by implementing a `__next__` method for
both classes which handles iterating over the data using a generator pattern,
and by changing `__iter__` to return the object instead of the data which it
would iterate over. Alias the `__next__` method to `next` when working with
python 2.x in order to maintain functional compatibility between the two major
versions.
As part of this work and to ensure readability, push the initialization of the
parser objects up one layer and pass it down to a helper function. This could
have been done via a decorator, but I was trying to keep it simple for other
developers to make it easier to modify in the future.
This fixes ResourceWarnings with python 3.
PR: 237403
MFC after: 1 week
Tested with: python 2.7.16 (amd64), python 3.6.8 (amd64)
In version 3.2+, `array.array(..).tostring()` was renamed to
`array.array(..).tobytes()`. Conditionally call `array.array(..).tobytes()` if
the python version is 3.2+.
PR: 237403
MFC after: 1 week
Replace uses of `foo.encode("hex")` with `binascii.hexlify(foo)` for forwards
compatibility between python 2.x and python 3.
PR: 237403
MFC after: 1 week
Python 3 no longer doesn't support encoding/decoding hexadecimal numbers using
the `str.format` method. The backwards compatible new method (using the
binascii module/methods) is a comparable means of converting to/from
hexadecimal format.
In short, the functional change is the following:
* `foo.decode('hex')` -> `binascii.unhexlify(foo)`
* `foo.encode('hex')` -> `binascii.hexlify(foo)`
While here, move the dpkt import in `cryptodev.py` down per PEP8, so it comes
after the standard library provided imports.
PR: 237403
MFC after: 1 week
If a user sets both atime and mtime to UTIME_NOW when calling a syscall like
utimensat(2), allow the server to choose what "now" means. Due to the
design of FreeBSD's VFS, it's not possible to do this for just one of atime
or mtime; it's all or none.
PR: 237181
Sponsored by: The FreeBSD Foundation
If the server sets fuse_attr.blksize to a nonzero value in the response to
FUSE_GETATTR, then the client should use that as the value for
stat.st_blksize .
Sponsored by: The FreeBSD Foundation
This commit upgrades the FUSE API to protocol 7.9 and adds unit tests for
backwards compatibility with servers built for version 7.8. It doesn't
implement any of 7.9's new features yet.
Sponsored by: The FreeBSD Foundation
As of r347410 IPSec is no longer built into GENERIC. The ipsec.ko module must
be loaded before we can execute the IPSec tests.
Check this, and skip the tests if IPSec is not available.
When using poll, kevent, or select there was a race window during which it
would be impossible to shut down the daemon. The problem was that poll,
kevent, and select don't return when the file descriptor gets closed (or
maybe it was that the file descriptor got closed before those syscalls were
entered?). The solution is to impose a timeout on those syscalls, and check
m_quit after they time out.
Sponsored by: The FreeBSD Foundation
Just like /dev/devctl, /dev/fuse will now report the number of operations
available for immediate read in the kevent.data field during kevent(2).
Sponsored by: The FreeBSD Foundation
/dev/fuse was already pollable with poll and select. Add support for
kqueue, too. And add tests for polling with poll, select, and kqueue.
Sponsored by: The FreeBSD Foundation
* In the fatal_signal test, wait for the daemon to receive FUSE_INTERRUPT
before exiting.
* Explicitly disable restarting syscalls after SIGUSR2. This fixes
intermittency in the priority test. I don't know why, but sometimes that
test's mkdir would be restarted, and sometimes it would return EINTR.
ERESTART should be the default.
* Remove a useless copy/pasted sleep in the priority test.
Sponsored by: The FreeBSD Foundation
If the daemon dies, return ENOTCONN for all operations that have already
been sent to the daemon, as well as any new ones.
Sponsored by: The FreeBSD Foundation
When a FUSE daemon dies or closes /dev/fuse, all of that daemon's pending
requests must be terminated. Previously that was done in /dev/fuse's
.d_close method. However, d_close only gets called on the *last* close of
the device. That means that if multiple daemons were running concurrently,
all but the last daemon to close would leave their I/O hanging around. The
problem was easily visible just by running "kyua -v parallelism=2 test" in
fusefs's test directory.
Fix this bug by terminating a daemon's pending I/O during /dev/fuse's
cdvpriv dtor method instead. That method runs on every close of a file.
Also, fix some potential races in the tests:
* Clear SA_RESTART when registering the daemon's signal handler so read(2)
will return EINTR.
* Wait for the daemon to die before unmounting the mountpoint, so we won't
see an unwanted FUSE_DESTROY operation in the mock file system.
Sponsored by: The FreeBSD Foundation
* Convert from plain to TAP for slightly improved introspection when skipping
the tests due to requirements not being met.
* Test for the net/py-dpkt (origin) package being required when running the
tests, instead of relying on a copy of the dpkt.py module from 2014. This
enables the tests to work with py3. Subsequently, remove
`tests/sys/opencrypto/dpkt.py(c)?` via `make delete-old`.
* Parameterize out `python2` as `$PYTHON`.
PR: 237403
MFC after: 1 week
Some fusefs tests must sleep because they deliberately trigger a race, or
because they're testing the cache timeout functionality. Consolidate the
sleep interval in a single place so it will be easy to adjust. Shorten it
from either 500ms or 250ms to 100ms. From experiment I find that 10ms works
every time, so 100ms should be fairly safe.
Sponsored by: The FreeBSD Foundation
Replace some sleeps with semaphore operations. Not all sleeps can be
replaced, though. Some are trying to lose a race.
Sponsored by: The FreeBSD Foundation
libfuse expects sockets to be created with FUSE_MKNOD, not FUSE_CREATE,
because that's how Linux does it. My first attempt at creating sockets
(r346894) used FUSE_CREATE because FreeBSD uses VOP_CREATE for this purpose.
There are no backwards-compatibility concerns with this change, because
socket support hasn't yet been merged to head.
Sponsored by: The FreeBSD Foundation
Any change to a directory's contents should cause its mtime and ctime to be
updated by the FUSE daemon. Clear its attribute cache so we'll get the new
attributs the next time that they're needed. This affects the following
VOPs: VOP_CREATE, VOP_LINK, VOP_MKDIR, VOP_MKNOD, VOP_REMOVE, VOP_RMDIR, and
VOP_SYMLINK
Reported by: pjdfstest
Sponsored by: The FreeBSD Foundation
If the file to be renamed is a directory and it's going to get a new parent,
then the user must have write permissions to that directory, because the
".." dirent must be changed.
Reported by: pjdfstest
Sponsored by: The FreeBSD Foundation
FUSE_LINK returns a new set of attributes. fusefs should cache them just
like it does during other VOPs. This is not only a matter of performance
but of correctness too; without caching the new attributes the vnode's nlink
value would be out-of-date.
Reported by: pjdfstest
Sponsored by: The FreeBSD Foundation
Even an unprivileged user should be able to chown a file to its current
owner, or chgrp it to its current group. Those are no-ops.
Reported by: pjdfstest
Sponsored by: The FreeBSD Foundation
ftruncate should succeed as long as the file descriptor is writable, even if
the file doesn't have write permission. This is important when combined
with O_CREAT.
Reported by: pjdfstest
Sponsored by: The FreeBSD Foundation
Don't allow unprivileged users to set SGID on files to whose group they
don't belong. This is slightly different than what POSIX says we should do
(clear sgid on return from a successful chmod), but it matches what UFS
currently does.
Reported by: pjdfstest
Sponsored by: The FreeBSD Foundation
When mounted with -o default_permissions fusefs is supposed to validate all
permissions in the kernel, not the file system. This commit fixes two
permissions that I had previously overlooked.
* Only root may chown a file
* Non-root users may only chgrp a file to a group to which they belong
PR: 216391
Sponsored by: The FreeBSD Foundation
This test had been disabled because it was designed to check protocol
7.9-specific functionality. Enable it without the 7.9-specific bit.
Sponsored by: The FreeBSD Foundation
An off-by-one error led to the last page of a write not being removed from
its object, even though that page's buffer was marked as invalid.
PR: 235774
Sponsored by: The FreeBSD Foundation
Though it's not documented, Linux will interpret a FUSE_INTERRUPT response
of ENOSYS as "the file system does not support FUSE_INTERRUPT".
Subsequently it will never send FUSE_INTERRUPT again to the same mount
point. This change matches Linux's behavior.
PR: 346357
Sponsored by: The FreeBSD Foundation
`xrange` is a pre-python 2.x compatible idiom. Use `range` instead. The values
being iterated over are sufficiently small that using range on python 2.x won't
be a noticeable issue.
MFC after: 2 months
Replace `except Environment, e:` with `except Environment as e` for
compatibility between python 2.x and python 3.x.
While here, fix a bad indentation change from r346620 by reindenting the code
properly.
MFC after: 2 months
From r346443:
"""
Replace hard tabs with four-character indentations, per PEP8.
This is being done to separate stylistic changes from the tests from functional
ones, as I accidentally introduced a bug to the tests when I used four-space
indentation locally.
No functional change.
"""
MFC after: 2 months
Discussed with: jhb
The CCM test vectors use a slightly different file format in that
there are global key-value pairs as well as section key-value pairs
that need to be used in each test. In addition, the sections can set
multiple key-value pairs in the section name. The CCM KAT parser
class is an iterator that returns a dictionary once per test where the
dictionary contains all of the relevant key-value pairs for a given
test (global, section name, section, test-specific).
Note that all of the CCM decrypt tests use nonce and tag lengths that
are not supported by OCF (OCF only supports a 12 byte nonce and 16
byte tag), so none of the decryption vectors are actually tested.
Reviewed by: ngie
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D19978
Pass in an explicit digest length to the Crypto constructor since it
was assuming only sessions with a MAC key would have a MAC. Passing
an explicit size allows us to test the full digest in HMAC tests as
well.
Reviewed by: cem
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D19884
This copes more gracefully when older version of the nist-kat package
are intalled that don't have newer test vectors such as CCM or plain
SHA.
If the nist-kat package is not installed at all, this still fails with
an error.
Reviewed by: cem
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D20034
r346162 factored out v_inval_buf_range from vtruncbuf, but it made an error
in the interface between the two. The result was a failure to remove
buffers past the first. Surprisingly, I couldn't reproduce the failure with
file systems other than fuse.
Also, modify fusefs's truncate_discards_cached_data test to catch this bug.
PR: 346162
Sponsored by: The FreeBSD Foundation
The zero-padding when printing out the Size field is on 32-bit architectures is
5, not 15. Adjust the regular expression to work with both the 32-bit and
64-bit case.
MFC after: 1 week
Reviewed by: lwhsu, markj
Approved by: emaste (mentor, implicit)
Differential Revision: https://reviews.freebsd.org/D20005
My wide sweeping stylistic change (while well intended) is impeding others from
working on `tests/sys/opencrypto`.
The plan is to revert the change in ^/head, then reintroduce the changes after
the other changes get merged into ^/head .
Approved by: emaste (mentor; implicit)
Requested by: jhb
MFC after: 2 months
Replace hard tabs with four-character indentations, per PEP8.
This is being done to separate stylistic changes from the tests from functional
ones, as I accidentally introduced a bug to the tests when I used four-space
indentation locally.
No functional change.
MFC after: 2 months
Approved by: emaste (mentor: implicit blanket approval for trivial fixes)
When interrupting a FUSE operation, send the FUSE_INTERRUPT op to the daemon
ASAP, ahead of other unrelated operations.
PR: 236530
Sponsored by: The FreeBSD Foundation
fusefs's VOP_SETEXTATTR calls uiomove(9) before blocking, so it can't be
restarted. It must be interrupted instead.
PR: 236530
Sponsored by: The FreeBSD Foundation
If a pending FUSE operation hasn't yet been sent to the daemon, then there's
no reason to inform the daemon that it's been interrupted. Instead, simply
remove it from the fuse message queue and set its status to EINTR or
ERESTART as appropriate.
PR: 346357
Sponsored by: The FreeBSD Foundation
* If a process receives a fatal signal while blocked on a fuse operation,
return ASAP without waiting for the operation to complete. But still send
the FUSE_INTERRUPT op to the daemon.
* Plug memory leaks from r346339
Interruptibility is now fully functional, but it could be better:
* Operations that haven't been sent to the server yet should be aborted
without sending FUSE_INTERRUPT.
* It would be great if write operations could be made restartable.
That would require delaying uiomove until the last possible moment, which
would be sometime during fuse_device_read.
* It would be nice if we didn't have to guess which EAGAIN responses were
for FUSE_INTERRUPT operations.
PR: 236530
Sponsored by: The FreeBSD Foundation
The test should fail if pf rules can't be set. This is helpful both
while writing tests and to verify that pfctl works as expected.
MFC after: 1 week
Event: Aberdeen hackathon 2019
The fuse protocol includes a FUSE_INTERRUPT operation that the client can
send to the server to indicate that it wants to abort an in-progress
operation. It's required to interrupt any syscall that is blocking on a
fuse operation.
This commit adds basic FUSE_INTERRUPT support. If a process receives any
signal while it's blocking on a FUSE operation, it will send a
FUSE_INTERRUPT and wait for the original operation to complete. But there
is still much to do:
* The current code will leak memory if the server ignores FUSE_INTERRUPT,
which many do. It will also leak memory if the server completes the
original operation before it receives the FUSE_INTERRUPT.
* An interrupted read(2) will incorrectly appear to be successful.
* fusefs should return immediately for fatal signals.
* Operations that haven't been sent to the server yet should be aborted
without sending FUSE_INTERRUPT.
* Test coverage should be better.
* It would be great if write operations could be made restartable.
That would require delaying uiomove until the last possible moment, which
would be sometime during fuse_device_read.
PR: 236530
Sponsored by: The FreeBSD Foundation
There was an issue with copyin() on DIOCRSETTFLAGS, which would panic if
pfrio_buffer was NULL.
Test for the issue fixed in r346319.
MFC after: 1 week
Event: Aberdeen hackathon 2019
fusefs's default cache mode is "writethrough", although it currently works
more like "write-around"; writes bypass the cache completely. Since writes
bypass the cache, they were leaving stale previously-read data in the cache.
This commit invalidates that stale data. It also adds a new global
v_inval_buf_range method, like vtruncbuf but for a range of a file.
PR: 235774
Reported by: cem
Sponsored by: The FreeBSD Foundation
For many FUSE opcodes, an error of ENOSYS has special meaning. fusefs
already handled some of those; this commit adds handling for the remainder:
* FUSE_FSYNC, FUSE_FSYNCDIR: ENOSYS means "success, and automatically return
success without calling the daemon from now on"
* All extattr operations: ENOSYS means "fail EOPNOTSUPP, and automatically
do it without calling the daemon from now on"
PR: 236557
Sponsored by: The FreeBSD Foundation
fusefs tracks each vnode's parent. The rename code was already correctly
updating it. Delete a comment that said otherwise, and add a regression
test for it.
Sponsored by: The FreeBSD Foundation
Don't panic if the server changes the file type of a file without us first
deleting it. That could indicate a buggy server, but it could also be the
result of one of several race conditions. Return EAGAIN as we do elsewhere.
Sponsored by: The FreeBSD Foundation
I got most of -o default_permissions working in r346088. This commit adds
sticky bit checks. One downside is that sometimes there will be an extra
FUSE_GETATTR call for the parent directory during unlink or rename. But in
actual use I think those attributes will almost always be cached.
PR: 216391
Sponsored by: The FreeBSD Foundation
fuse_vnop_lookup was using a FUSE_GETATTR operation when looking up "." and
"..", even though the only information it needed was the file type and file
size. "." and ".." are obviously always going to be directories; there's no
need to double check.
Sponsored by: The FreeBSD Foundation
* Eliminate fuse_access_param. Whatever it was supposed to do, it seems
like it was never complete. The only real function it ever seems to have
had was a minor performance optimization, which I've already eliminated.
* Make extended attribute operations obey the allow_other mount option.
* Allow unprivileged access to the SYSTEM extattr namespace when
-o default_permissions is not in use.
* Disallow setextattr and deleteextattr on read-only mounts.
* Add tests for a few more error cases.
Sponsored by: The FreeBSD Foundation
Normally all permission checking is done in the fuse server. But when -o
default_permissions is used, it should be done in the kernel instead. This
commit adds appropriate permission checks through fusefs when -o
default_permissions is used. However, sticky bit checks aren't working yet.
I'll handle those in a follow-up commit.
There are no checks for file flags, because those aren't supported by our
version of the FUSE protocol. Nor is there any support for ACLs, though
that could be added if there were any demand.
PR: 216391
Reported by: hiyorin@gmail.com
Sponsored by: The FreeBSD Foundation
The FUSE protocol includes a way for a server to tell the client that a
negative lookup response is cacheable for a certain amount of time.
PR: 236226
Sponsored by: The FreeBSD Foundation
1. Not all kernels have netmap(4) support. Check for netmap(4) support before
attempting to run the tests via the `PLAIN_REQUIRE_KERNEL_MODULE(..)` macro.
2. Libraries shouldn't be added to LDFLAGS; they should be added to LIBADD
instead. This allows the build system to evaluate dependencies for sanity.
3. Sort some of the Makefile variables per bsd.README.
1., in particular, will resolve failures when running this testcase on kernels
lacking netmap(4) support, e.g., the i386 GENERIC kernels on ^/stable/11 and
^/stable/12.
PR: 237129
Reviewed by: vmaffione
Approved by: emaste (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D19864
Follow-up to r346046. These two commits implement fuse cache timeouts for
both entries and attributes. They also remove the vfs.fusefs.lookup_cache
enable sysctl, which is no longer needed now that cache timeouts are
honored.
PR: 235773
Sponsored by: The FreeBSD Foundation
Final cleanup routines shouldn't be called from testcases; it should be called
from the testcase cleanup routine.
Furthermore, `geli_test_cleanup` should take care of cleaning up geli providers
and the memory disks used for the geli providers. `geli_test_cleanup` will always
be executed whereas the equivalent logic in `geli_test_body`, may not have been
executed if the test failed prior to the logic being run.
Prior to this change, the test case was trying to clean up `$md` twice: once in
at the end of the test case body function, and the other in the cleanup function.
The cleanup function logic was failing because there wasn't anything to clean up
in the cleanup function and the errors weren't being ignored.
This fixes FreeBSD test suite runs after r345864.
PR: 237128
Reviewed by: asomers, pjd
Approved by: emaste (mentor)
MFC with: r345864
Differential Revision: https://reviews.freebsd.org/D19854
The FUSE protocol allows the server to specify the timeout period for the
client's attribute and entry caches. This commit implements the timeout
period for the attribute cache. The entry cache's timeout period is
currently disabled because it panics, and is guarded by the
vfs.fusefs.lookup_cache_expire sysctl.
PR: 235773
Reported by: cem
Sponsored by: The FreeBSD Foundation
FUSE_LOOKUP, FUSE_GETATTR, FUSE_SETATTR, FUSE_MKDIR, FUSE_LINK,
FUSE_SYMLINK, FUSE_MKNOD, and FUSE_CREATE all return file attributes with a
cache validity period. fusefs will now cache the attributes, if the server
returns a non-zero cache validity period.
This change does _not_ implement finite attr cache timeouts. That will
follow as part of PR 235773.
PR: 235775
Reported by: cem
Sponsored by: The FreeBSD Foundation
Such processes will be reparented to the reaper when the current
parent is done with them (i.e., ptrace detached), so p_oppid must be
updated accordingly.
Add a regression test to exercise this code path. Previously it
would not be possible to reap an orphan with a stale oppid.
Reviewed by: kib, mjg
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19825
VOP_ACCESS was never fully implemented in fusefs. This change:
* Removes the FACCESS_DO_ACCESS flag, which pretty much disabled the whole
vop.
* Removes a quixotic special case for VEXEC on regular files. I don't know
why that was in there.
* Removes another confusing special case for VADMIN.
* Removes the FACCESS_NOCHECKSPY flag. It seemed to be a performance
optimization, but I'm unconvinced that it was a net positive.
* Updates test cases.
This change does NOT implement -o default_permissions. That will be handled
separately.
PR: 236291
Sponsored by: The FreeBSD Foundation
When -o allow_other is not in use, fusefs is supposed to prevent access to
the filesystem by any user other than the one who owns the daemon. Our
fusefs implementation was only enforcing that restriction at the mountpoint
itself. That was usually good enough because lookup usually descends from
the mountpoint. However, there are cases when it doesn't, such as when
using openat relative to a file beneath the mountpoint.
PR: 237052
Sponsored by: The FreeBSD Foundation
These tests were actually fixed by r345398, r345390 and r345392, but I
neglected to reenable them. Too bad googletest doesn't have the notion of
an Expected Failure like ATF does.
PR: 236474, 236473
Sponsored by: The FreeBSD Foundation
If a fuse file system returne FOPEN_KEEP_CACHE in the open or create
response, then the client is supposed to _not_ clear its caches for that
file. I don't know why clearing the caches would be the default given that
there's a separate flag to bypass the cache altogether, but that's the way
it is. fusefs(5) will now honor this flag.
Our behavior is slightly different than Linux's because we reuse file
handles. That means that open(2) wont't clear the cache if there's a
reusable file handle, even if the file server wouldn't have sent
FOPEN_KEEP_CACHE had we opened a new file handle like Linux does.
PR: 236560
Sponsored by: The FreeBSD Foundation
If a FUSE daemon returns FOPEN_DIRECT_IO when a file is opened, then it's
allowed to write less data than was requested during a FUSE_WRITE operation
on that file handle. fusefs should simply return a short write to userland.
The old code attempted to resend the unsent data. Not only was that
incorrect behavior, but it did it in an ineffective way, by attempting to
"rewind" the uio and uiomove the unsent data again.
This commit correctly handles short writes by returning directly to
userland if FOPEN_DIRECT_IO was set. If it wasn't set (making the short
write technically a protocol violation), then we resend the unsent data.
But instead of rewinding the uio, just resend the data that's already in the
kernel.
That necessitated a few changes to fuse_ipc.c to reduce the amount of bzero
activity. fusefs may be marginally faster as a result.
PR: 236381
Sponsored by: The FreeBSD Foundation
- init, init -R
- onetime, onetime -R
- 512 and 4k sectors
- encryption only
- encryption and authentication
- configure -r/-R for detached providers
- configure -r/-R for attached providers
- all keys allocated (10, 20 and 30MB provider sizes)
- keys allocated on demand (10, 20 and 30PB provider sizes)
- reading and writing to provider after expansion (10-30MB only)
- checking if metadata in old location is cleared.
Obtained from: Fudo Security
The original fusefs import, r238402, contained a bug in fuse_vnop_close that
could close a directory's file handle while there were still other open file
descriptors. The code looks deliberate, but there is no explanation for it.
This necessitated a workaround in fuse_vnop_readdir that would open a new
file handle if, "for some mysterious reason", that vnode didn't have any
open file handles. r345781 had the effect of causing the workaround to
panic, making the problem more visible.
This commit removes the workaround and the original bug, which also fixes
the panic.
Sponsored by: The FreeBSD Foundation
The FUSE protocol says that FUSE_FLUSH should be send every time a file
descriptor is closed. That's not quite possible in FreeBSD because multiple
file descriptors can share a single struct file, and closef doesn't call
fo_close until the last close. However, we can still send FUSE_FLUSH on
every VOP_CLOSE, which is probably good enough.
There are two purposes for FUSE_FLUSH. One is to allow file systems to
return EIO if they have an error when writing data that's cached
server-side. The other is to release POSIX file locks (which fusefs(5) does
not yet support).
PR: 236405, 236327
Sponsored by: The FreeBSD Foundation
During truncate, fusefs was discarding entire cached blocks, but it wasn't
zeroing out the unused portion of a final partial block. This resulted in
reads returning stale data.
PR: 233783
Reported by: fsx
Sponsored by: The FreeBSD Foundation
Better Makefile syntax.
Note that this commit is to the project branch, but the review concerns the
merge to head.
Sponsored by: The FreeBSD Foundation
This change takes capsicum-test from upstream and applies some local changes to make the
tests work on FreeBSD when executed via Kyua.
The local modifications are as follows:
1. Make `OpenatTest.WithFlag` pass with the new dot-dot lookup behavior in FreeBSD 12.x+.
2. capsicum-test references a set of helper binaries: `mini-me`, `mini-me.noexec`, and
`mini-me.setuid`, as part of the execve/fexecve tests, via execve, fexecve, and open.
It achieves this upstream by assuming `mini-me*` is in the current directory, however,
in order for Kyua to execute `capsicum-test`, it needs to provide a full path to
`mini-me*`. In order to achieve this, I made `capsicum-test` cache the executable's
path from argv[0] in main(..) and use the cached value to compute the path to
`mini-me*` as part of the execve/fexecve testcases.
3. The capsicum-test test suite assumes that it's always being run on CAPABILITIES enabled
kernels. However, there's a chance that the test will be run on a host without a
CAPABILITIES enabled kernel, so we must check for the support before running the tests.
The way to achieve this is to add the relevant `feature_present("security_capabilities")`
check to SetupEnvironment::SetUp() and skip the tests when the support is not available.
While here, add a check for `kern.trap_enotcap` being enabled. As noted by markj@ in
https://github.com/google/capsicum-test/issues/23, this sysctl being enabled can trigger
non-deterministic failures. Therefore, the tests should be skipped if this sysctl is
enabled.
All local changes have been submitted to the capsicum-test project
(https://github.com/google/capsicum-test) and are in various stages of review.
Please see the following pull requests for more details:
1. https://github.com/google/capsicum-test/pull/35
2. https://github.com/google/capsicum-test/pull/41
3. https://github.com/google/capsicum-test/pull/42
Reviewed by: asomers
Discussed with: emaste, markj
Approved by: emaste (mentor)
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D19758
By default, FUSE performs authorization in the server. That means that it's
insecure for the client to reuse FUSE file handles between different users,
groups, or processes. Linux handles this problem by creating a different
FUSE file handle for every file descriptor. FreeBSD can't, due to
differences in our VFS design.
This commit adds credential information to each fuse_filehandle. During
open(2), fusefs will now only reuse a file handle if it matches the exact
same access mode, pid, uid, and gid of the calling process.
PR: 236844
Sponsored by: The FreeBSD Foundation
O_EXEC is useful for fexecve(2) and fchdir(2). Treat it as another fufh
type alongside the existing RDONLY, WRONLY, and RDWR. Prior to r345742 this
would've caused a memory and performance penalty.
PR: 236329
Sponsored by: The FreeBSD Foundation
This test shows how bug 236844 can lead to a privilege escalation when used
with the -o allow_other mount option.
PR: 236844
Sponsored by: The FreeBSD Foundation
Previously fusefs would treat any file opened O_WRONLY as though the
FOPEN_DIRECT_IO flag were set, in an attempt to avoid issuing reads as part
of a RMW write operation on a cached part of the file. However, the FUSE
protocol explicitly allows reads of write-only files for precisely that
reason.
Sponsored by: The FreeBSD Foundation
VOP_GETPAGES is disabled when vfs.fusefs.data_cache_mode=0, causing mmap to
return success but accessing the mapped memory will subsequently segfault.
Sponsored by: The FreeBSD Foundation
Surprisingly, open(..., O_WRONLY | O_CREAT, 0444) should work. POSIX
requires it. But it didn't work in early FUSE implementations. Add a
regression test so that our FUSE driver doesn't make the same mistake.
Sponsored by: The FreeBSD Foundation
fuse_vnop_create must close the newly created file if it can't allocate a
vnode. When it does so, it must use the same file flags for FUSE_RELEASE as
it used for FUSE_OPEN or FUSE_CREATE.
Reported by: Coverity
Coverity CID: 1066204
Sponsored by: The FreeBSD Foundation
The test could occasionally hang if the parent's SIGUSR2 signal arrived
before the child had pause()d. Using POSIX semaphores precludes that
possibility.
Sponsored by: The FreeBSD Foundation
If a FUSE filesystem returns ENOSYS for FUSE_CREATE, then fallback to
FUSE_MKNOD/FUSE_OPEN.
Also, fix a memory leak in the error path of fuse_vnop_create. And do a
little cleanup in fuse_vnop_open.
PR: 199934
Reported by: samm@os2.kiev.ua
Sponsored by: The FreeBSD Foundation
The FUSE protocol allows for LOOKUP to return a cacheable negative response,
which means that the file doesn't exist and the kernel can cache its
nonexistence. As of this commit fusefs doesn't cache the nonexistence, but
it does correctly handle such responses. Prior to this commit attempting to
create a file, even with O_CREAT would fail with ENOENT if the daemon
returned a cacheable negative response.
PR: 236231
Sponsored by: The FreeBSD Foundation
For an unknown reason, fusefs was _always_ sending the fdatasync operation
instead of fsync. Now it correctly sends one or the other.
Also, remove the Fsync.fsync_metadata_only test, along with the recently
removed Fsync.nop. They should never have been added. The kernel shouldn't
keep track of which files have dirty data; that's the daemon's job.
PR: 236473
Sponsored by: The FreeBSD Foundation
It's sufficient to check for /dev/fuse. And due to bug 236647, the module
could be named either fuse or fusefs.
PR: 236647
Sponsored by: The FreeBSD Foundation
Also, fix one of the default_permissions test cases. I forgot the
expectation for FUSE_ACCESS, because that doesn't work right now.
Sponsored by: The FreeBSD Foundation
Now the entire fuse test suite can "pass", or at least not fail. Skipped
tests are reported to Kyua as passes, because googletest is still using
Kyua's plain test adapter.
Sponsored by: The FreeBSD Foundation
Sometimes the fuse daemon doesn't die as soon as its /dev/fuse file
descriptor is closed; it needs to be unmounted first.
Sponsored by: The FreeBSD Foundation
Revision r345269 introduced changes that triggered a regression on netmap
unit tests (tests/sys/netmap/ctrl-api-test.c).
This change updates the unit tests to remove the regression.
Reported by: lwhsu
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D19639
This commit adds tests for the default_permissions and push_symlinks_in
mount options. It doesn't add tests for allow_other, because I'm not sure
how that will interact with Kyua (the test will need to drop privileges).
All of the other mount options are undocumented.
PR: 216391
Sponsored by: The FreeBSD Foundation
This mutes the duplicate target warning emitted via bsd.files.mk each build.
MFC after: 1 week
Reviewed by: asomers
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D19603
* Test that FUSE_FLUSH and FUSE_RELEASE release POSIX file locks
* Test that FUSE_SETATTR's attr caching feature works
* Fix some minor mistakes in the posix file lock tests
Sponsored by: The FreeBSD Foundation
This required changing the way that all operations are mocked. Previously
MockFS::process had one input argument and one output argument. Now, it
returns a vector of zero or more responses. This allows tests to simulate
conditions where the filesystem daemon has a queue depth > 1.
PR: 236530
Sponsored by: The FreeBSD Foundation
Make the tests run slightly faster by having pft_ping.py end the capture
of packets as soon as it sees the expected packet, rather than
continuing to sniff.
MFC after: 2 weeks
There was a problem destroying renamed tun interfaces in vnet jails. This was
fixed in r344794. Test the previously failing scenario.
PR: 235704
MFC after: 2 weeks
The netipsec and pf tests have a number of common test functions. These
used to be duplicated, but it makes more sense for them to re-use the
common functions.
PR: 236223
This is marginally faster than using an environment check in each test case.
Also, if the global check fails then all of the tests are skipped. Oddly,
it's not possible to skip a test in any other way.
Also, allow the test to run as a normal user if vfs.usermount=1 and
/dev/fuse is accessible.
Reported by: ngie
Sponsored by: The FreeBSD Foundation
It only tests the kernel portion of fuse, not the userspace portion (which
comes from sysutils/fusefs-libs). The kernel-userspace interface is
de-facto standardized, and this test suite seeks to validate FreeBSD's
implementation.
It uses GoogleMock to substitute for a userspace daemon and validate the
kernel's behavior in response to filesystem access. GoogleMock is
convenient because it can validate the order, number, and arguments of each
operation, and return canned responses.
But that also means that the test suite must use GoogleTest, since
GoogleMock is incompatible with atf-c++ and atf.test.mk does not allow C++
programs to use atf-c.
This commit adds the first 10 test cases out of an estimated 130 total.
PR: 235775, 235773
Sponsored by: The FreeBSD Foundation
Generate a fragmented packet with different header chains, to provoke
the incorrect behaviour of pf.
Without the fix this will trigger a panic.
Obtained from: Corentin Bayet, Nicolas Collignon, Luca Moro at Synacktiv
pfctl has an issue with 'set skip on <group>', which causes inconsistent
behaviour: the set skip directive works initially, but does not take
effect when the same rules are re-applied.
PR: 229241
MFC after: 1 week
When building with KCOV enabled the compiler will insert function calls
to probes allowing us to trace the execution of the kernel from userspace.
These probes are on function entry (trace-pc) and on comparison operations
(trace-cmp).
Userspace can enable the use of these probes on a single kernel thread with
an ioctl interface. It can allocate space for the probe with KIOSETBUFSIZE,
then mmap the allocated buffer and enable tracing with KIOENABLE, with the
trace mode being passed in as the int argument. When complete KIODISABLE
is used to disable tracing.
The first item in the buffer is the number of trace event that have
happened. Userspace can write 0 to this to reset the tracing, and is
expected to do so on first use.
The format of the buffer depends on the trace mode. When in PC tracing just
the return address of the probe is stored. Under comparison tracing the
comparison type, the two arguments, and the return address are traced. The
former method uses on entry per trace event, while the later uses 4. As
such they are incompatible so only a single mode may be enabled.
KCOV is expected to help fuzzing the kernel, and while in development has
already found a number of issues. It is required for the syzkaller system
call fuzzer [1]. Other kernel fuzzers could also make use of it, either
with the current interface, or by extending it with new modes.
A man page is currently being worked on and is expected to be committed
soon, however having the code in the kernel now is useful for other
developers to use.
[1] https://github.com/google/syzkaller
Submitted by: Mitchell Horne <mhorne063@gmail.com> (Earlier version)
Reviewed by: kib
Testing by: tuexen
Sponsored by: DARPA, AFRL
Sponsored by: The FreeBSD Foundation (Mitchell Horne)
Differential Revision: https://reviews.freebsd.org/D14599
Import the unit tests from upstream (https://github.com/luigirizzo/netmap
ba02539859d46d33), and make them ready for use with Kyua.
There are currently 38 regression tests, which test the kernel control ABI
exposed by netmap to userspace applications:
1: test for port info get
2-5: tests for basic port registration
6-9: tests for VALE
10-11: tests for getting netmap allocator info
12-15: tests for netmap pipes
16: test on polling mode
17-18: tests on options
19-27: tests for sync-kloop subsystem
28-39: tests for null ports
31-38: tests for the legacy NIOCREGIF registers
Reviewed by: ngie
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18490
MK_AUDIT already controls auditd(8), praudit(1), etc. It should also control
the audit test suite.
Submitted by: ngie
MFC after: 2 weeks
Pull Request: https://github.com/freebsd/freebsd/pull/240
These tests should be skipped if /etc/rc.d/auditd is missing, which could be
the case if world was built with WITHOUT_AUDIT set. Also, one test case
requires /etc/rc.d/accounting.
Submitted by: ngie
MFC after: 2 weeks
Pull Request: https://github.com/freebsd/freebsd/pull/240
It's been reported that pf doesn't handle running out of available ports
for NAT correctly. It freezes until a state expires and it can find a
free port.
Test for this, by setting up a situation where only two ports are
available for NAT and then attempting to create three connections.
If successful the third connection will fail immediately. In an
incorrect case the connection attempt will freeze, also freezing all
interaction with pf through pfctl and trigger timeout.
PR: 233867
MFC after: 2 weeks
Use ATF_TC_CLEANUP(), because that means the cleanup code will get
called even if a test fails. Before it would only be executed if every
test within the body succeeded.
Reported by: Marie Helene Kvello-Aune <marieheleneka@gmail.com>
MFC after: 2 weeks
Explicitly mark these tests as requiring root rights. We need to be able
to open /dev/pf.
Reported by: Marie Helene Kvello-Aune <marieheleneka@gmail.com>
MFC after: 2 weeks
Once a signal's siginfo was copied to 'td_si' as part of the signal
exchange in issignal(), it was never cleared. This caused future
thread events that are reported as SIGTRAP events without signal
information to report the stale siginfo in 'td_si'. For example, if a
debugger created a new process and used SIGSTOP to stop it after
PT_ATTACH, future system call entry / exit events would set PL_FLAG_SI
with the SIGSTOP siginfo in pl_siginfo. This broke 'catch syscall' in
current versions of gdb as it assumed PL_FLAG_SI with SIGTRAP
indicates a breakpoint or single step trap.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18487
Re-apply r341665 with format strings fixed.
If we happen to taste a stale mirror component first, don't reject valid,
newer components that have differing metadata from the stale component
(during STARTING). Instead, update our view of the most recent metadata as
we taste components.
Like mediasize beforehand, remove some checks from g_mirror_check_metadata
which would evict valid components due to metadata that can change over a
mirror's lifetime. g_mirror_check_metadata is invoked long before we check
genid/syncid and decide which component(s) are newest and whether or not we
have quorum.
Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW
component is added, first remove any known stale or inconsistent disks from
the mirrorset, rather than removing them *after* deciding we have quorum.
Check if we have quorum after removing these components.
Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to
force gmirrors to wait out the full timeout (kern.geom.mirror.timeout)
before transitioning from STARTING to RUNNING. This is a kludge to help
ensure all eligible, boot-time available mirror components are tasted before
RUNNING a gmirror.
Add a basic test case for STARTING -> RUNNING startup behavior around stale
genids.
PR: 232671, 232835
Submitted by: Cindy Yang <cyang AT isilon.com> (previous version)
Reviewed by: markj (kernel portions)
Discussed with: asomers, Cindy Yang
Tested by: pho
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D18062
r341392 changed common test cleanup routines in a way that allowed them to
be used by TAP tests as well as ATF tests. However, a late change made
during code review resulted in cleanup being broken for ATF tests, which
source geom_subr.sh separately during the body and cleanup phases of the
test. The result was that md(4) devices wouldn't get cleaned up.
MFC after: 2 weeks
X-MFC-With: 341392
If we happen to taste a stale mirror component first, don't reject valid,
newer components that have differing metadata from the stale component
(during STARTING). Instead, update our view of the most recent metadata as
we taste components.
Like mediasize beforehand, remove some checks from g_mirror_check_metadata
which would evict valid components due to metadata that can change over a
mirror's lifetime. g_mirror_check_metadata is invoked long before we check
genid/syncid and decide which component(s) are newest and whether or not we
have quorum.
Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW
component is added, first remove any known stale or inconsistent disks from
the mirrorset, rather than removing them *after* deciding we have quorum.
Check if we have quorum after removing these components.
Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to
force gmirrors to wait out the full timeout (kern.geom.mirror.timeout)
before transitioning from STARTING to RUNNING. This is a kludge to help
ensure all eligible, boot-time available mirror components are tasted before
RUNNING a gmirror.
When we are instructed to forget mirror components, bump the generation id
to avoid confusion with such stale components later.
Add a basic test case for STARTING -> RUNNING startup behavior around stale
genids.
PR: 232671, 232835
Submitted by: Cindy Yang <cyang AT isilon.com> (previous version)
Reviewed by: markj (kernel portions)
Discussed with: asomers, Cindy Yang
Tested by: pho
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D18062
The problem with the logic prior to this commit was twofold:
1. The wrong set of idioms (TAP-compatible) were being applied to the ATF
testcases when run, resulting in confusing ATF failure results on setup.
2. The cleanup subroutines were broken when the geom classes could not be
loaded as they exited with 0 unexpectedly.
This commit changes the test code to source the class-specific configuration
(conf.sh) once globally, instead of sourcing it per testcase and per cleanup
subroutine, and to call the ATF-specific setup subroutine(s) inline in
the testcases.
The refactoring done is effectively a no-op for the TAP testcases, modulo
any refactoring done to create common code between the ATF and TAP
testcases.
This unbreaks the geli testcases converted to ATF in r327662 and r327683,
and the gmirror testcases added in r327780, respectively, when the geom
class could not be loaded.
tests/sys/geom/class/mirror/...
While here, ignore errors when turning debug failpoint sysctl off, which
could occur if the gmirror class was not loaded.
Submitted by: ngie
MFC after: 2 weeks
Pull Request: https://github.com/freebsd/freebsd/pull/241
Fix reporting of SS_ONSTACK in nested signal delivery when sigaltstack()
is used on some architectures.
Add a unit test for this. I tested the test by introducing the bug
on amd64. I did not test it on other architectures.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D18347
After r337820, which "corrected" some spaces-instead-of-tab whitespace
issues in the libkqueue tests, jmg@ pointed out that these files were
originally space-based, not tab-spaced, and so the correction should
have been to get rid of the tabs that had been introduced in previous
changes, not the spaces. This change does that. This is a whitespace
only change; no functional change is intended.
Reported by: jmg@
MFC after: 3 days
Sponsored by: Dell EMC Isilon
Replace hard-coded epair0b with the variable holds the real epair interface
used for testing.
Reviewed by: kp
Approved by: emaste, markj (mentors)
MFC with: r339836
Sponsored by: The FreeBSD Foundation
Set up two jails, configure pfsync between them and create state in one
of them, verify that this state is copied to the other jail.
MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D17504
Unconditionally reparenting to PID 1 breaks the procctl(2) reaper
functionality.
Add a regression test for this case.
Reviewed by: kib
Approved by: re (gjb)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17589
Originally, these tests accidentally used broadcast addresses when they
should've used unicast addresses. That the tests passed prior to r337736
was accidental.
Submitted by: ae
Reviewed by: olivier
MFC after: 2 weeks