Main part is that kern_copyin on amd64 after LA57 should query the top
of UVA for correct operations. In fact it should started doing that
after the workaround for AMD bug with IRET in the last user page was
fixed by reducing UVA by a page.
Also since we started calculating top of UVA, fix MIPS according to
the comment.
Reported by: lwhsu
PR: 248933
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26312
Messing with gnop devices under a zpool fails in this test, causing
the pool to be suspended and eventually the system to deadlock.
Skip the test for now until the issue is resolved.
PR: tests/248910
Discussed with: lwhsu
Sponsored by: iXsystems, Inc.
The primary benefit is maintaining a completely shared
code base with the community allowing FreeBSD to receive
new features sooner and with less effort.
I would advise against doing 'zpool upgrade'
or creating indispensable pools using new
features until this change has had a month+
to soak.
Work on merging FreeBSD support in to what was
at the time "ZFS on Linux" began in August 2018.
I first publicly proposed transitioning FreeBSD
to (new) OpenZFS on December 18th, 2018. FreeBSD
support in OpenZFS was finally completed in December
2019. A CFT for downstreaming OpenZFS support in
to FreeBSD was first issued on July 8th. All issues
that were reported have been addressed or, for
a couple of less critical matters there are
pull requests in progress with OpenZFS. iXsystems
has tested and dogfooded extensively internally.
The TrueNAS 12 release is based on OpenZFS with
some additional features that have not yet made
it upstream.
Improvements include:
project quotas, encrypted datasets,
allocation classes, vectorized raidz,
vectorized checksums, various command line
improvements, zstd compression.
Thanks to those who have helped along the way:
Ryan Moeller, Allan Jude, Zack Welch, and many
others.
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25872
RTF_HOST indicates whether route is a host route
(netmask is empty or /{32,128}).
Check that if netmask is empty and host route is not specified, kernel
returns an error.
Differential Revision: https://reviews.freebsd.org/D26155
Thanks to r364064, the name cache now returns a hit where previously it
would miss. Adjust the expectations accordingly.
PR: 248583
Reported by: lwhsu
MFC with: r364064
This fixes possible link errors, similar to:
ld: error: undefined symbol: iface_setup_addr
>>> referenced by test_rtsock_l3.c:111 (tests/sys/net/routing/test_rtsock_l3.c:111)
>>> test_rtsock_l3.o:(presetup_ipv4)
>>> referenced by test_rtsock_l3.c:79 (tests/sys/net/routing/test_rtsock_l3.c:79)
>>> test_rtsock_l3.o:(presetup_ipv6)
>>> referenced by test_rtsock_l3.c:512 (tests/sys/net/routing/test_rtsock_l3.c:512)
>>> test_rtsock_l3.o:(atfu_rtm_change_v4_gw_success_body)
>>> referenced 10 more times
In C (not C++), 'naked' inline is almost always a mistake. Either use
static inline (this is appropriate for most cases), or extern inline.
MFC after: 3 days
This avoids injecting errors into the test system's mirrors.
gnop seems like a good solution here but it injects errors at the wrong
place vs where these tests expect and does not support a 'max global count'
like the failpoints do with 'n*' syntax.
Reviewed by: cem, vangyzen
Sponsored by: Dell EMC Isilon
Prior to this change a `SF_IMMUTABLE` chflagsat(2)'ed file (`path`) was left
behind, which sabotaged kyua(1) from being able to clean up the work directory,
This resulted in unnecessary work for folks having to clean up the work
directory on non-disposable systems, which defaults to `/tmp`. Use `UF_OFFLINE`
instead of `SF_IMMUTABLE`, in part because setting `SF_IMMUTABLE` isn't relevant
to the test and `SF_IMMUTABLE` cannot be cleared at all securelevels, as pointed
out by @asomers.
Additional work is required to catch cases like this upfront in the future to
avoid tester headache. See PR # 247765 for more details/followup.
Suggested by: asomers
Reviewed By: asomers, #tests
MFC after: 1 week
PR: 247761
Sponsored by: DellEMC
Differential Revision: https://reviews.freebsd.org/D25561
memfd_create fds will no longer require an ftruncate(2) to set the size;
they'll grow (to the extent that it's possible) upon write(2)-like syscalls.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D25502
geli does all of its crypto operations in a separate thread pool, so
g_eli_start, g_eli_read_done, and g_eli_write_done don't actually do very
much work. Enabling direct dispatch eliminates the g_up/g_down bottlenecks,
doubling IOPs on my system. This change does not affect the thread pool.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D25587
This test should no longer provoke large amounts of traffic, which can
overwhelm single-core systems, preventing them from making progress in the
tests.
The test can now be re-enabled.
PR: 246448
Enable STP before bringing the bridges up. This avoids a switching loop,
which has a tendency to drown out progress in userspace processes,
especially on single-core systems.
Only check that we have indeed shut down one of the looped interfaces
PR: 246448
Reviewed by: melifaro
Differential Revision: https://reviews.freebsd.org/D25084
Fix the netinet/netinet6 divert tests falsely reporting 'ipdivert module is
not loaded' when the divert module is built into the kernel
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D25026
This patch fixes two issues relating to FUSE_ACCESS when the
default_permissions mount option is disabled:
* VOP_ACCESS() calls with VADMIN set should never be sent to a fuse server
in the form of FUSE_ACCESS operations. The FUSE protocol has no equivalent
of VADMIN, so we must evaluate such things kernel-side, regardless of the
default_permissions setting.
* The FUSE protocol only requires FUSE_ACCESS to be sent for two purposes:
for the access(2) syscall and to check directory permissions for
searchability during lookup. FreeBSD sends it much more frequently, due to
differences between our VFS and Linux's, for which FUSE was designed. But
this patch does eliminate several cases not required by the FUSE protocol:
* for any FUSE_*XATTR operation
* when creating a new file
* when deleting a file
* when setting timestamps, such as by utimensat(2).
* Additionally, when default_permissions is disabled, this patch removes one
FUSE_GETATTR operation when deleting a file.
PR: 245689
Reported by: MooseFS FreeBSD Team <freebsd@moosefs.pro>
Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24777
When a FUSE operation other than LOOKUP returns ENOENT, the kernel will
reclaim that vnode, resuling in a FUSE_FORGET being sent a short while
later. Many of the ENOENT tests weren't expecting those FUSE_FORGET
operations. They usually passed by luck since FUSE_FORGET is often delayed.
This commit adds appropriate expectations.
MFC after: 2 weeks
common_init_tbl is only used within this single CU, so it should be marked
static.
WARNS=6 also complained about the var defined by
`ATF_TC_WITH_CLEANUP(getastats);` being unused, which turns out to be
because it's not been hooked up in ATF_TP_ADD_TCS. kp@ did not immediately
recall any reason for this, and the case passes on my local system, so hook
it up.
Note that I've not yet set WARNS= 6 here. Investigation is underway to see
if we can feasibly default WARNS to 6 for src builds to catch directories
too deep to inherit a WARNS from the top-level subdirectories' Makefile.inc.
Those particular WARNS settings will be subsequently removed as they become
redundant with a more-global default.
MFC after: 1 week
The test makefiles will handle setting mode bits during install. Also,
Phabricator gets upset when uploading an executable plain-text file
without a shebang.
MFC after: 1 week
These two errors have been present since the tests' introduction.
Coincidentally every test (I think there's only one) that cares about that
field also works when the field's value is 0.
MFC after: 2 weeks
This test uses a gnop feature (delay probability) that isn't available on
stable/12. But it's unnecessary; the test works fine without it. Removing
it simplifies the test and, once MFCed, will allow it to pass on stable/12.
PR: 244158
Reported by: lwhsu
MFC after: 2 weeks
mac_bsdextended(4), when enabled, causes ordinary operations to send many
more VOP_GETATTRs to file system. The fusefs tests expectations aren't
written with those in mind. Optionally expecting them would greatly
obfuscate the fusefs tests. Worse, certain fusefs functionality (like
attribute caching) would be impossible to test if the tests couldn't expect
an exact number of GETATTR operations.
This commit resolves that conflict by making two changes:
1. The fusefs tests will now check for mac_bsdextended, and skip if it's
enabled.
2. The mac_bsdextended tests will now check whether the module is enabled, not
merely loaded. If it's loaded but disabled, the tests will automatically
enable it for the duration of the tests.
With these changes, a CI system can achieve best coverage by loading both
fusefs and mac_bsdextended at boot, and setting
security.mac.bsdextended.enabled=0
PR: 244229
Reported by: lwhsu
Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24577
This removes support for the following algorithms:
- ARC4
- Blowfish
- CAST128
- DES
- 3DES
- MD5-HMAC
- Skipjack
Since /dev/crypto no longer supports 3DES, stop testing the 3DES KAT
vectors in cryptotest.py.
Reviewed by: cem (previous version)
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24346
We used to have an issue with recursive locking with
net.link.bridge.inherit_mac. This causes us to send an ARP request while
we hold the BRIDGE_LOCK, which used to cause us to acquire the
BRIDGE_LOCK again. We can't re-acquire it, so this caused a panic.
Now that we no longer need to acquire the BRIDGE_LOCK for
bridge_transmit() this should no longer panic. Test this.
PR: 216510
Reviewed by: emaste, philip
MFC after: 2 months
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D24251
The new tests have more complete setup and cleanup, are more granular, and
correctly annotate expected failures and skipped tests. A follow-up commit
will resolve a conflict with the fusefs tests (bug 244229).
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24257
- Lookup device drivers to test by name instead of assuming that the
software / hardware flags will select specific drivers.
- Set the sysctl to permit software /dev/crypto requests when testing
the accelerated software blake2 driver.
PR: 245825
Reported by: lwhsu
Reviewed by: cem, lwhsu
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24540
There were ultimately two separate problems here:
- a 32-bit long cannot represent microseconds since 1970 (noted by ian)
- time_t is 32-bit on i386, so now() was wrong anyways even with the correct
return type.
For the first, just explicitly use a uint64_t for now() and all of the
callers. For the second, we need to explicitly cast tv_sec to uint64_t
before it gets multiplied in the SEC_TO_US macro. Casting this instance
rather than generally in the macro was arbitrarily chosen simply because all
other uses are converting small relative time values.
The tests now pass on i386, at least; presumably other ILP32 will be fine
now as well.
We used to have a problem where bridges created in different vnet jails
would end up having the same mac address. This is now fixed by
including the jail name as a seed for the mac address generation, but we
should verify that it doesn't regress.
Originally noticed while attempting to run the kqueue tests under
qemu-user-static, this apparently just happens sometimes when running in a
jail in general -- the timer will fire off "too early," but it's really just
the result of imprecise measurements (noted by cem).
Kicking this over to NOTE_USECONDS still tests the correct thing while
allowing it to work more consistently; a basic sanity test reveals that we
often end up coming in just less than 200 microseconds after the timer
fired off.
MFC after: 3 days
closefrom has been converted to close_range internally; remediation is
underway for this, marking it as an expected fail for now while proper
course is determined.
PR: 245625
Similar to mmap'ing vnodes, posixshm should count any mapping where maxprot
contains VM_PROT_WRITE (i.e. fd opened r/w with no write-seal applied) as
writable and thus blocking of any write-seal.
The memfd tests have been amended to reflect the fixes here, which notably
includes:
1. Fix for error return bug; EPERM is not a documented failure mode for mmap
2. Fix rejection of write-seal with active mappings that can be upgraded via
mprotect(2).
Reported by: markj
Discussed with: markj, kib
close_range will clamp the range between [0, fdp->fd_lastfile], but failed
to take into account that fdp->fd_lastfile can become -1 if all fds are
closed. =-( In this scenario, just return because there's nothing further we
can do at the moment.
Add a test case for this, fork() and simply closefrom(0) twice in the child;
on the second invocation, fdp->fd_lastfile == -1 and will trigger a panic
before this change.
X-MFC-With: r359836
close_range(min, max, flags) allows for a range of descriptors to be
closed. The Python folk have indicated that they would much prefer this
interface to closefrom(2), as the case may be that they/someone have special
fds dup'd to higher in the range and they can't necessarily closefrom(min)
because they don't want to hit the upper range, but relocating them to lower
isn't necessarily feasible.
sys_closefrom has been rewritten to use kern_close_range() using ~0U to
indicate closing to the end of the range. This was chosen rather than
requiring callers of kern_close_range() to hold FILEDESC_SLOCK across the
call to kern_close_range for simplicity.
The flags argument of close_range(2) is currently unused, so any flags set
is currently EINVAL. It was added to the interface in Linux so that future
flags could be added for, e.g., "halt on first error" and things of this
nature.
This patch is based on a syscall of the same design that is expected to be
merged into Linux.
Reviewed by: kib, markj, vangyzen (all slightly earlier revisions)
Differential Revision: https://reviews.freebsd.org/D21627
Set up three vnet jails, bridged together. Run carp between two of them.
Attempt to provoke locking / epoch issues.
Reviewed by: mav (previous version), melifaro, asomers
Differential Revision: https://reviews.freebsd.org/D24303
vnode_fd and kqfd are both shared among multiple CU; define them exactly
once.
In the case of vnode_fd, it was simply the declaration that needed
correction.
-fno-common will become the default in GCC10/LLVM11.
MFC after: 3 days
Many rtsock tests verify the ordering of the kernel messages for the
particular event. In order to avoid flaky tests due to the other tests
running, switch all tests to use personal vnet-enabled jails.
This removes all clashes on the IP addresses and brings back the ability
to run these tests simultaneously.
Reported by: olivier
Reviewed by: olivier
Differential Revision: https://reviews.freebsd.org/D24182
- The linked list of cryptoini structures used in session
initialization is replaced with a new flat structure: struct
crypto_session_params. This session includes a new mode to define
how the other fields should be interpreted. Available modes
include:
- COMPRESS (for compression/decompression)
- CIPHER (for simply encryption/decryption)
- DIGEST (computing and verifying digests)
- AEAD (combined auth and encryption such as AES-GCM and AES-CCM)
- ETA (combined auth and encryption using encrypt-then-authenticate)
Additional modes could be added in the future (e.g. if we wanted to
support TLS MtE for AES-CBC in the kernel we could add a new mode
for that. TLS modes might also affect how AAD is interpreted, etc.)
The flat structure also includes the key lengths and algorithms as
before. However, code doesn't have to walk the linked list and
switch on the algorithm to determine which key is the auth key vs
encryption key. The 'csp_auth_*' fields are always used for auth
keys and settings and 'csp_cipher_*' for cipher. (Compression
algorithms are stored in csp_cipher_alg.)
- Drivers no longer register a list of supported algorithms. This
doesn't quite work when you factor in modes (e.g. a driver might
support both AES-CBC and SHA2-256-HMAC separately but not combined
for ETA). Instead, a new 'crypto_probesession' method has been
added to the kobj interface for symmteric crypto drivers. This
method returns a negative value on success (similar to how
device_probe works) and the crypto framework uses this value to pick
the "best" driver. There are three constants for hardware
(e.g. ccr), accelerated software (e.g. aesni), and plain software
(cryptosoft) that give preference in that order. One effect of this
is that if you request only hardware when creating a new session,
you will no longer get a session using accelerated software.
Another effect is that the default setting to disallow software
crypto via /dev/crypto now disables accelerated software.
Once a driver is chosen, 'crypto_newsession' is invoked as before.
- Crypto operations are now solely described by the flat 'cryptop'
structure. The linked list of descriptors has been removed.
A separate enum has been added to describe the type of data buffer
in use instead of using CRYPTO_F_* flags to make it easier to add
more types in the future if needed (e.g. wired userspace buffers for
zero-copy). It will also make it easier to re-introduce separate
input and output buffers (in-kernel TLS would benefit from this).
Try to make the flags related to IV handling less insane:
- CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv'
member of the operation structure. If this flag is not set, the
IV is stored in the data buffer at the 'crp_iv_start' offset.
- CRYPTO_F_IV_GENERATE means that a random IV should be generated
and stored into the data buffer. This cannot be used with
CRYPTO_F_IV_SEPARATE.
If a consumer wants to deal with explicit vs implicit IVs, etc. it
can always generate the IV however it needs and store partial IVs in
the buffer and the full IV/nonce in crp_iv and set
CRYPTO_F_IV_SEPARATE.
The layout of the buffer is now described via fields in cryptop.
crp_aad_start and crp_aad_length define the boundaries of any AAD.
Previously with GCM and CCM you defined an auth crd with this range,
but for ETA your auth crd had to span both the AAD and plaintext
(and they had to be adjacent).
crp_payload_start and crp_payload_length define the boundaries of
the plaintext/ciphertext. Modes that only do a single operation
(COMPRESS, CIPHER, DIGEST) should only use this region and leave the
AAD region empty.
If a digest is present (or should be generated), it's starting
location is marked by crp_digest_start.
Instead of using the CRD_F_ENCRYPT flag to determine the direction
of the operation, cryptop now includes an 'op' field defining the
operation to perform. For digests I've added a new VERIFY digest
mode which assumes a digest is present in the input and fails the
request with EBADMSG if it doesn't match the internally-computed
digest. GCM and CCM already assumed this, and the new AEAD mode
requires this for decryption. The new ETA mode now also requires
this for decryption, so IPsec and GELI no longer do their own
authentication verification. Simple DIGEST operations can also do
this, though there are no in-tree consumers.
To eventually support some refcounting to close races, the session
cookie is now passed to crypto_getop() and clients should no longer
set crp_sesssion directly.
- Assymteric crypto operation structures should be allocated via
crypto_getkreq() and freed via crypto_freekreq(). This permits the
crypto layer to track open asym requests and close races with a
driver trying to unregister while asym requests are in flight.
- crypto_copyback, crypto_copydata, crypto_apply, and
crypto_contiguous_subsegment now accept the 'crp' object as the
first parameter instead of individual members. This makes it easier
to deal with different buffer types in the future as well as
separate input and output buffers. It's also simpler for driver
writers to use.
- bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer.
This understands the various types of buffers so that drivers that
use DMA do not have to be aware of different buffer types.
- Helper routines now exist to build an auth context for HMAC IPAD
and OPAD. This reduces some duplicated work among drivers.
- Key buffers are now treated as const throughout the framework and in
device drivers. However, session key buffers provided when a session
is created are expected to remain alive for the duration of the
session.
- GCM and CCM sessions now only specify a cipher algorithm and a cipher
key. The redundant auth information is not needed or used.
- For cryptosoft, split up the code a bit such that the 'process'
callback now invokes a function pointer in the session. This
function pointer is set based on the mode (in effect) though it
simplifies a few edge cases that would otherwise be in the switch in
'process'.
It does split up GCM vs CCM which I think is more readable even if there
is some duplication.
- I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC
as an auth algorithm and updated cryptocheck to work with it.
- Combined cipher and auth sessions via /dev/crypto now always use ETA
mode. The COP_F_CIPHER_FIRST flag is now a no-op that is ignored.
This was actually documented as being true in crypto(4) before, but
the code had not implemented this before I added the CIPHER_FIRST
flag.
- I have not yet updated /dev/crypto to be aware of explicit modes for
sessions. I will probably do that at some point in the future as well
as teach it about IV/nonce and tag lengths for AEAD so we can support
all of the NIST KAT tests for GCM and CCM.
- I've split up the exising crypto.9 manpage into several pages
of which many are written from scratch.
- I have converted all drivers and consumers in the tree and verified
that they compile, but I have not tested all of them. I have tested
the following drivers:
- cryptosoft
- aesni (AES only)
- blake2
- ccr
and the following consumers:
- cryptodev
- IPsec
- ktls_ocf
- GELI (lightly)
I have not tested the following:
- ccp
- aesni with sha
- hifn
- kgssapi_krb5
- ubsec
- padlock
- safe
- armv8_crypto (aarch64)
- glxsb (i386)
- sec (ppc)
- cesa (armv7)
- cryptocteon (mips64)
- nlmsec (mips64)
Discussed with: cem
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D23677
Change type of variable used in setsocketopt so correct size of
option is passed.
Test failure was identified when running the test on PowerPC64,
and the following error message was seen:
"bind () failed: Address already in use"
Submitted by: Fernando Valle <fernando.valle@eldorado.org.br>
Reviewed by: melifaro, adalava
Approved by: jhibbits (mentor)
Sponsored by: Eldorado Research Institute (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D24164
The FUSE protocol allows the client (kernel) to cache a file's size, if the
server (userspace daemon) allows it. A well-behaved daemon obviously should
not change a file's size while a client has it cached. But a buggy daemon
might. If the kernel ever detects that that has happened, then it should
invalidate the entire cache for that file. Previously, we would not only
cache stale data, but in the case of a file extension while we had the size
cached, we accidentally extended the cache with zeros.
PR: 244178
Reported by: Ben RUBSON <ben.rubson@gmx.com>
Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D24012
Basic test case where we create a bridge loop, verify that we really are
looping and then enable spanning tree to resolve the loop.
Reviewed by: philip
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D23959
We were reusing a structure for multiple operations, but failing to
reinitialize one member. The result is that a server that cares about FUSE
file handle IDs would see one correct FUSE_FSYNC operation, and one with the
FHID unset.
PR: 244431
Reported by: Agata <chogata@gmail.com>
MFC after: 2 weeks