Fix the netinet/netinet6 divert tests falsely reporting 'ipdivert module is
not loaded' when the divert module is built into the kernel
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D25026
This patch fixes two issues relating to FUSE_ACCESS when the
default_permissions mount option is disabled:
* VOP_ACCESS() calls with VADMIN set should never be sent to a fuse server
in the form of FUSE_ACCESS operations. The FUSE protocol has no equivalent
of VADMIN, so we must evaluate such things kernel-side, regardless of the
default_permissions setting.
* The FUSE protocol only requires FUSE_ACCESS to be sent for two purposes:
for the access(2) syscall and to check directory permissions for
searchability during lookup. FreeBSD sends it much more frequently, due to
differences between our VFS and Linux's, for which FUSE was designed. But
this patch does eliminate several cases not required by the FUSE protocol:
* for any FUSE_*XATTR operation
* when creating a new file
* when deleting a file
* when setting timestamps, such as by utimensat(2).
* Additionally, when default_permissions is disabled, this patch removes one
FUSE_GETATTR operation when deleting a file.
PR: 245689
Reported by: MooseFS FreeBSD Team <freebsd@moosefs.pro>
Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24777
When a FUSE operation other than LOOKUP returns ENOENT, the kernel will
reclaim that vnode, resulting in a FUSE_FORGET being sent a short while
later. Many of the ENOENT tests weren't expecting those FUSE_FORGET
operations. They usually passed by luck since FUSE_FORGET is often delayed.
This commit adds appropriate expectations.
MFC after: 2 weeks
common_init_tbl is only used within this single CU, so it should be marked
static.
WARNS=6 also complained about the var defined by
`ATF_TC_WITH_CLEANUP(getastats);` being unused, which turns out to be
because it's not been hooked up in ATF_TP_ADD_TCS. kp@ did not immediately
recall any reason for this, and the case passes on my local system, so hook
it up.
Note that I've not yet set WARNS=6 here. Investigation is underway to see
if we can feasibly default WARNS to 6 for src builds to catch directories
too deep to inherit a WARNS from the top-level subdirectories' Makefile.inc.
Those particular WARNS settings will be subsequently removed as they become
redundant with a more-global default.
MFC after: 1 week
The test makefiles will handle setting mode bits during install. Also,
Phabricator gets upset when uploading an executable plain-text file
without a shebang.
MFC after: 1 week
These two errors have been present since the tests' introduction.
Coincidentally every test (I think there's only one) that cares about that
field also works when the field's value is 0.
MFC after: 2 weeks
This test uses a gnop feature (delay probability) that isn't available on
stable/12. But it's unnecessary; the test works fine without it. Removing
it simplifies the test and, once MFCed, will allow it to pass on stable/12.
PR: 244158
Reported by: lwhsu
MFC after: 2 weeks
mac_bsdextended(4), when enabled, causes ordinary operations to send many
more VOP_GETATTRs to the file system. The fusefs tests' expectations aren't
written with those in mind. Optionally expecting them would greatly
obfuscate the fusefs tests. Worse, certain fusefs functionality (like
attribute caching) would be impossible to test if the tests couldn't expect
an exact number of GETATTR operations.
This commit resolves that conflict by making two changes:
1. The fusefs tests will now check for mac_bsdextended, and skip if it's
enabled.
2. The mac_bsdextended tests will now check whether the module is enabled, not
merely loaded. If it's loaded but disabled, the tests will automatically
enable it for the duration of the tests.
With these changes, a CI system can achieve best coverage by loading both
fusefs and mac_bsdextended at boot, and setting
security.mac.bsdextended.enabled=0
PR: 244229
Reported by: lwhsu
Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24577
This removes support for the following algorithms:
- ARC4
- Blowfish
- CAST128
- DES
- 3DES
- MD5-HMAC
- Skipjack
Since /dev/crypto no longer supports 3DES, stop testing the 3DES KAT
vectors in cryptotest.py.
Reviewed by: cem (previous version)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24346
We used to have an issue with recursive locking with
net.link.bridge.inherit_mac. This causes us to send an ARP request while
we hold the BRIDGE_LOCK, which used to cause us to acquire the
BRIDGE_LOCK again. We can't re-acquire it, so this caused a panic.
Now that we no longer need to acquire the BRIDGE_LOCK for
bridge_transmit() this should no longer panic. Test this.
PR: 216510
Reviewed by: emaste, philip
MFC after: 2 months
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24251
The new tests have more complete setup and cleanup, are more granular, and
correctly annotate expected failures and skipped tests. A follow-up commit
will resolve a conflict with the fusefs tests (bug 244229).
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24257
- Lookup device drivers to test by name instead of assuming that the
software / hardware flags will select specific drivers.
- Set the sysctl to permit software /dev/crypto requests when testing
the accelerated software blake2 driver.
PR: 245825
Reported by: lwhsu
Reviewed by: cem, lwhsu
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24540
There were ultimately two separate problems here:
- a 32-bit long cannot represent microseconds since 1970 (noted by ian)
- time_t is 32-bit on i386, so now() was wrong anyways even with the correct
return type.
For the first, just explicitly use a uint64_t for now() and all of the
callers. For the second, we need to explicitly cast tv_sec to uint64_t
before it gets multiplied in the SEC_TO_US macro. Casting this instance
rather than generally in the macro was arbitrarily chosen simply because all
other uses are converting small relative time values.
The tests now pass on i386, at least; presumably other ILP32 will be fine
now as well.
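A minimal sketch of the widening described above. SEC_TO_US and now() are
named in this message; timeval_to_us() is a hypothetical helper introduced
only to make the conversion testable, and the macro body is an assumption
about its shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Assumed shape of the macro: a plain multiplication with no cast. */
#define SEC_TO_US(t) ((t) * 1000000)

/*
 * Convert a timeval to microseconds since the epoch.  tv_sec is widened
 * to uint64_t *before* the multiplication, so the product cannot be
 * truncated on platforms where time_t or long is 32 bits wide.
 */
static uint64_t
timeval_to_us(const struct timeval *tv)
{
	assert(tv->tv_sec >= 0 && tv->tv_usec >= 0);
	return (SEC_TO_US((uint64_t)tv->tv_sec) + (uint64_t)tv->tv_usec);
}

/* Current wall-clock time in microseconds, returned as a 64-bit value. */
static uint64_t
now(void)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	return (timeval_to_us(&tv));
}
```

Casting at the call site rather than inside the macro leaves the other
callers, which convert small relative values, unchanged.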
We used to have a problem where bridges created in different vnet jails
would end up having the same mac address. This is now fixed by
including the jail name as a seed for the mac address generation, but we
should verify that it doesn't regress.
Originally noticed while attempting to run the kqueue tests under
qemu-user-static, this apparently just happens sometimes when running in a
jail in general -- the timer will fire off "too early," but it's really just
the result of imprecise measurements (noted by cem).
Kicking this over to NOTE_USECONDS still tests the correct thing while
allowing it to work more consistently; a basic sanity test reveals that we
often end up coming in just less than 200 microseconds after the timer
fired off.
MFC after: 3 days
closefrom has been converted to close_range internally; remediation is
underway for this, marking it as an expected fail for now while proper
course is determined.
PR: 245625
Similar to mmap'ing vnodes, posixshm should count any mapping where maxprot
contains VM_PROT_WRITE (i.e. fd opened r/w with no write-seal applied) as
writable and thus as blocking any write-seal.
The memfd tests have been amended to reflect the fixes here, which notably
includes:
1. Fix for error return bug; EPERM is not a documented failure mode for mmap
2. Fix rejection of write-seal with active mappings that can be upgraded via
mprotect(2).
Reported by: markj
Discussed with: markj, kib
close_range will clamp the range between [0, fdp->fd_lastfile], but failed
to take into account that fdp->fd_lastfile can become -1 if all fds are
closed. =-( In this scenario, just return because there's nothing further we
can do at the moment.
Add a test case for this: fork() and simply closefrom(0) twice in the child;
on the second invocation, fdp->fd_lastfile == -1 and will trigger a panic
before this change.
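The early return can be sketched as a standalone model. This is not the
kernel code: clamp_close_range() and its parameters are hypothetical, with
lastfile standing in for fdp->fd_lastfile.

```c
#include <assert.h>

/*
 * Clamp the requested range [lowfd, highfd] to the open descriptors
 * [0, lastfile].  Returns 1 and stores the clamped bounds if there is
 * anything to close, or 0 if the range is empty.  lastfile == -1 models
 * a table in which every descriptor has already been closed.
 */
static int
clamp_close_range(unsigned int lowfd, unsigned int highfd, int lastfile,
    int *first, int *last)
{
	assert(lastfile >= -1);
	if (lastfile == -1)
		return (0);	/* Nothing is open; bail out early. */
	if (lowfd > highfd || lowfd > (unsigned int)lastfile)
		return (0);
	if (highfd > (unsigned int)lastfile)
		highfd = (unsigned int)lastfile;
	*first = (int)lowfd;
	*last = (int)highfd;
	return (1);
}
```

Without the first check, converting lastfile == -1 to unsigned would yield
a huge upper bound instead of an empty range.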
X-MFC-With: r359836
close_range(min, max, flags) allows for a range of descriptors to be
closed. The Python folk have indicated that they would much prefer this
interface to closefrom(2), as the case may be that they/someone have special
fds dup'd to higher in the range and they can't necessarily closefrom(min)
because they don't want to hit the upper range, but relocating them to lower
isn't necessarily feasible.
sys_closefrom has been rewritten to use kern_close_range() using ~0U to
indicate closing to the end of the range. This was chosen rather than
requiring callers of kern_close_range() to hold FILEDESC_SLOCK across the
call to kern_close_range for simplicity.
The flags argument of close_range(2) is currently unused, so setting any
flag currently fails with EINVAL. It was added to the interface in Linux so
that future flags could be added for, e.g., "halt on first error" and things
of this nature.
This patch is based on a syscall of the same design that is expected to be
merged into Linux.
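The semantics described above can be modelled in userland as follows. This
is a sketch under stated assumptions, not the kernel implementation: the
fd_model table, MODEL_NFILES and both model_* functions are hypothetical,
and rejecting an inverted range with EINVAL is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_NFILES 16	/* hypothetical table size for this model */

struct fd_model {
	bool open[MODEL_NFILES];
};

/*
 * Model of close_range(min, max, flags): reject any flag, reject an
 * inverted range, then close every slot in [min, max].  Returns 0 or an
 * errno value.
 */
static int
model_close_range(struct fd_model *m, unsigned int min, unsigned int max,
    int flags)
{
	unsigned int fd;

	if (flags != 0)
		return (EINVAL);	/* no flags are defined yet */
	if (min > max)
		return (EINVAL);
	if (max >= MODEL_NFILES)
		max = MODEL_NFILES - 1;	/* ~0U means "to the end" */
	for (fd = min; fd <= max; fd++)
		m->open[fd] = false;
	return (0);
}

/* Model of closefrom(min), expressed through the range primitive. */
static void
model_closefrom(struct fd_model *m, unsigned int min)
{
	int error;

	error = model_close_range(m, min, ~0U, 0);
	assert(error == 0);
	(void)error;
}
```

Using ~0U as the upper bound lets the wrapper ask for "everything from min
upward" without knowing the size of the table.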
Reviewed by: kib, markj, vangyzen (all slightly earlier revisions)
Differential Revision: https://reviews.freebsd.org/D21627
Set up three vnet jails, bridged together. Run carp between two of them.
Attempt to provoke locking / epoch issues.
Reviewed by: mav (previous version), melifaro, asomers
Differential Revision: https://reviews.freebsd.org/D24303
vnode_fd and kqfd are both shared among multiple CUs; define them exactly
once.
In the case of vnode_fd, it was simply the declaration that needed
correction.
-fno-common will become the default in GCC10/LLVM11.
MFC after: 3 days
Many rtsock tests verify the ordering of the kernel messages for the
particular event. In order to avoid flaky tests due to the other tests
running, switch all tests to use personal vnet-enabled jails.
This removes all clashes on the IP addresses and brings back the ability
to run these tests simultaneously.
Reported by: olivier
Reviewed by: olivier
Differential Revision: https://reviews.freebsd.org/D24182
- The linked list of cryptoini structures used in session
initialization is replaced with a new flat structure: struct
crypto_session_params. This structure includes a new mode to define
how the other fields should be interpreted. Available modes
include:
- COMPRESS (for compression/decompression)
- CIPHER (for simple encryption/decryption)
- DIGEST (computing and verifying digests)
- AEAD (combined auth and encryption such as AES-GCM and AES-CCM)
- ETA (combined auth and encryption using encrypt-then-authenticate)
Additional modes could be added in the future (e.g. if we wanted to
support TLS MtE for AES-CBC in the kernel we could add a new mode
for that. TLS modes might also affect how AAD is interpreted, etc.)
The flat structure also includes the key lengths and algorithms as
before. However, code doesn't have to walk the linked list and
switch on the algorithm to determine which key is the auth key vs
encryption key. The 'csp_auth_*' fields are always used for auth
keys and settings and 'csp_cipher_*' for cipher. (Compression
algorithms are stored in csp_cipher_alg.)
- Drivers no longer register a list of supported algorithms. This
doesn't quite work when you factor in modes (e.g. a driver might
support both AES-CBC and SHA2-256-HMAC separately but not combined
for ETA). Instead, a new 'crypto_probesession' method has been
added to the kobj interface for symmetric crypto drivers. This
method returns a negative value on success (similar to how
device_probe works) and the crypto framework uses this value to pick
the "best" driver. There are three constants for hardware
(e.g. ccr), accelerated software (e.g. aesni), and plain software
(cryptosoft) that give preference in that order. One effect of this
is that if you request only hardware when creating a new session,
you will no longer get a session using accelerated software.
Another effect is that the default setting to disallow software
crypto via /dev/crypto now disables accelerated software.
Once a driver is chosen, 'crypto_newsession' is invoked as before.
- Crypto operations are now solely described by the flat 'cryptop'
structure. The linked list of descriptors has been removed.
A separate enum has been added to describe the type of data buffer
in use instead of using CRYPTO_F_* flags to make it easier to add
more types in the future if needed (e.g. wired userspace buffers for
zero-copy). It will also make it easier to re-introduce separate
input and output buffers (in-kernel TLS would benefit from this).
Try to make the flags related to IV handling less insane:
- CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv'
member of the operation structure. If this flag is not set, the
IV is stored in the data buffer at the 'crp_iv_start' offset.
- CRYPTO_F_IV_GENERATE means that a random IV should be generated
and stored into the data buffer. This cannot be used with
CRYPTO_F_IV_SEPARATE.
If a consumer wants to deal with explicit vs implicit IVs, etc. it
can always generate the IV however it needs and store partial IVs in
the buffer and the full IV/nonce in crp_iv and set
CRYPTO_F_IV_SEPARATE.
The layout of the buffer is now described via fields in cryptop.
crp_aad_start and crp_aad_length define the boundaries of any AAD.
Previously with GCM and CCM you defined an auth crd with this range,
but for ETA your auth crd had to span both the AAD and plaintext
(and they had to be adjacent).
crp_payload_start and crp_payload_length define the boundaries of
the plaintext/ciphertext. Modes that only do a single operation
(COMPRESS, CIPHER, DIGEST) should only use this region and leave the
AAD region empty.
If a digest is present (or should be generated), its starting
location is marked by crp_digest_start.
Instead of using the CRD_F_ENCRYPT flag to determine the direction
of the operation, cryptop now includes an 'op' field defining the
operation to perform. For digests I've added a new VERIFY digest
mode which assumes a digest is present in the input and fails the
request with EBADMSG if it doesn't match the internally-computed
digest. GCM and CCM already assumed this, and the new AEAD mode
requires this for decryption. The new ETA mode now also requires
this for decryption, so IPsec and GELI no longer do their own
authentication verification. Simple DIGEST operations can also do
this, though there are no in-tree consumers.
To eventually support some refcounting to close races, the session
cookie is now passed to crypto_getop() and clients should no longer
set crp_session directly.
- Asymmetric crypto operation structures should be allocated via
crypto_getkreq() and freed via crypto_freekreq(). This permits the
crypto layer to track open asym requests and close races with a
driver trying to unregister while asym requests are in flight.
- crypto_copyback, crypto_copydata, crypto_apply, and
crypto_contiguous_subsegment now accept the 'crp' object as the
first parameter instead of individual members. This makes it easier
to deal with different buffer types in the future as well as
separate input and output buffers. It's also simpler for driver
writers to use.
- bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer.
This understands the various types of buffers so that drivers that
use DMA do not have to be aware of different buffer types.
- Helper routines now exist to build an auth context for HMAC IPAD
and OPAD. This reduces some duplicated work among drivers.
- Key buffers are now treated as const throughout the framework and in
device drivers. However, session key buffers provided when a session
is created are expected to remain alive for the duration of the
session.
- GCM and CCM sessions now only specify a cipher algorithm and a cipher
key. The redundant auth information is not needed or used.
- For cryptosoft, split up the code a bit such that the 'process'
callback now invokes a function pointer in the session. This
function pointer is set based on the mode (in effect) though it
simplifies a few edge cases that would otherwise be in the switch in
'process'.
It does split up GCM vs CCM which I think is more readable even if there
is some duplication.
- I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC
as an auth algorithm and updated cryptocheck to work with it.
- Combined cipher and auth sessions via /dev/crypto now always use ETA
mode. The COP_F_CIPHER_FIRST flag is now a no-op that is ignored.
This was actually documented as being true in crypto(4) before, but
the code had not implemented this before I added the CIPHER_FIRST
flag.
- I have not yet updated /dev/crypto to be aware of explicit modes for
sessions. I will probably do that at some point in the future as well
as teach it about IV/nonce and tag lengths for AEAD so we can support
all of the NIST KAT tests for GCM and CCM.
- I've split up the existing crypto.9 manpage into several pages
of which many are written from scratch.
- I have converted all drivers and consumers in the tree and verified
that they compile, but I have not tested all of them. I have tested
the following drivers:
- cryptosoft
- aesni (AES only)
- blake2
- ccr
and the following consumers:
- cryptodev
- IPsec
- ktls_ocf
- GELI (lightly)
I have not tested the following:
- ccp
- aesni with sha
- hifn
- kgssapi_krb5
- ubsec
- padlock
- safe
- armv8_crypto (aarch64)
- glxsb (i386)
- sec (ppc)
- cesa (armv7)
- cryptocteon (mips64)
- nlmsec (mips64)
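The driver selection described for 'crypto_probesession' above can be
sketched as below. The constant names and values are hypothetical; only
their relative order (hardware preferred over accelerated software over
plain software) is taken from this message, and the sketch assumes that the
negative value closest to zero wins, as with device_probe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe results; only their relative order matters here. */
#define PROBE_HARDWARE		(-100)
#define PROBE_ACCEL_SOFTWARE	(-200)
#define PROBE_SOFTWARE		(-500)

struct driver_model {
	const char *name;
	int probe_result;	/* negative on success, errno (> 0) on failure */
};

/*
 * Return the driver whose probe result is negative and closest to zero,
 * or NULL if no driver accepted the session parameters.
 */
static const struct driver_model *
pick_best_driver(const struct driver_model *drivers, size_t n)
{
	const struct driver_model *best = NULL;
	size_t i;

	assert(drivers != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (drivers[i].probe_result >= 0)
			continue;	/* driver rejected these parameters */
		if (best == NULL ||
		    drivers[i].probe_result > best->probe_result)
			best = &drivers[i];
	}
	return (best);
}
```

A driver that cannot handle a given mode and algorithm combination simply
returns an error from its probe and drops out of the comparison.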
Discussed with: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D23677
Change the type of the variable used in setsockopt so that the correct
size of the option is passed.
Test failure was identified when running the test on PowerPC64,
and the following error message was seen:
"bind () failed: Address already in use"
Submitted by: Fernando Valle <fernando.valle@eldorado.org.br>
Reviewed by: melifaro, adalava
Approved by: jhibbits (mentor)
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D24164
The FUSE protocol allows the client (kernel) to cache a file's size, if the
server (userspace daemon) allows it. A well-behaved daemon obviously should
not change a file's size while a client has it cached. But a buggy daemon
might. If the kernel ever detects that that has happened, then it should
invalidate the entire cache for that file. Previously, we would not only
cache stale data, but in the case of a file extension while we had the size
cached, we accidentally extended the cache with zeros.
PR: 244178
Reported by: Ben RUBSON <ben.rubson@gmx.com>
Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24012
Basic test case where we create a bridge loop, verify that we really are
looping and then enable spanning tree to resolve the loop.
Reviewed by: philip
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23959
We were reusing a structure for multiple operations, but failing to
reinitialize one member. The result is that a server that cares about FUSE
file handle IDs would see one correct FUSE_FSYNC operation, and one with the
FHID unset.
PR: 244431
Reported by: Agata <chogata@gmail.com>
MFC after: 2 weeks
Very basic bridge test: Set up two jails and test that they can pass IPv4
traffic over the bridge.
Reviewed by: melifaro, philip
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23697
As with the rest of pjdfstest, tag the symlink with package=tests.
The tests -> . symlink seems a little strange but that's independent
of pkgbase.
Sponsored by: The FreeBSD Foundation
kldload() returns a positive integer when it loads a ko, so check that the
return value is -1 to detect error cases, not that it's different from zero.
MFC after: 3 days
X-MFC-With: r357234
kldload() returns an error (EEXIST) if the module is already loaded.
That's not a problem for us, so ignore that error.
While here also clean up include statements.
MFC after: 3 days
X-MFC-With: r357234
The horrific GENRAND construction bent over backwards to construct 64-bit
signed integers from the 31-bit output of random(3) for about 20 numbers per
test. Reproducibility wasn't a goal: random(3) was seeded with
srandomdev(3). Speed is not a factor for generating 20 integers with
arc4random(3). Range is not a factor: no use restricted the range to
anything narrower than the full [INT64_MIN, INT64_MAX]. Just use arc4random(3).
Reported by: Coverity
CIDs: 1404809, 1404817, 1404838, 1404840 and about 6x other
identical reports of dubious code relating to the
construction
if_epair abused the ifr_data field to insert its second interface in
IFC_IFLIST. If userspace provides a value for ifr_data it would get
dereferenced by the kernel leading to a panic.
Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after: 3 days
The routing subdirectory installed into the same directory as the net tests,
which caused them to overwrite the net Kyuafile. As a result these tests were
not executed.
X-MFC-With: r356146
Redirect (and temporal) route expiration was broken a while ago.
This change brings route expiration back, with unified IPv4/IPv6 handling code.
It introduces the net.inet.icmp.redirtimeout sysctl, which sets the
expiration time for redirected routes. It defaults to 10 minutes,
analogous to net.inet6.icmp6.redirtimeout.
The implementation uses a separate file, route_temporal.c, as route.c is
already bloated with tons of different functions.
Internally, expiration is implemented as a per-rnh callout scheduled when
a route with a non-zero rt_expire time is added or rt_expire is changed.
It does not add any overhead when no temporal routes are present.
The callout traverses the entire routing tree under wlock, scheduling
expired routes for deletion and calculating the next time it needs to run.
The rationale for this implementation is as follows: workloads requiring a
large number of routes typically have redirects turned off already, while
systems with a small number of routes will not incur a large overhead
during tree traversal.
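The traversal can be sketched over a flat array instead of the routing
tree. This is a model only: route_model and expire_sweep() are hypothetical
names, locking is omitted, and a zero return is assumed to mean that the
callout does not need to be rescheduled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct route_model {
	uint64_t expire;	/* absolute expiry time; 0 means "never" */
	bool deleted;
};

/*
 * Walk every route once.  Mark routes whose expiry time has passed and
 * return the earliest remaining expiry time, or 0 if no temporal routes
 * remain.
 */
static uint64_t
expire_sweep(struct route_model *routes, size_t n, uint64_t now)
{
	uint64_t next = 0;
	size_t i;

	assert(routes != NULL || n == 0);
	for (i = 0; i < n; i++) {
		struct route_model *rt = &routes[i];

		if (rt->deleted || rt->expire == 0)
			continue;	/* gone already, or permanent */
		if (rt->expire <= now) {
			rt->deleted = true;
			continue;
		}
		if (next == 0 || rt->expire < next)
			next = rt->expire;
	}
	return (next);
}
```

The cost is one full pass per firing, which matches the trade-off described
above: cheap for small tables, and not triggered at all when no route has an
expiry time.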
This change also fixes netstat -rn display of route expiration time, which
has been broken since the conversion from kread() to sysctl.
Reviewed by: bz
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D23075