This is the culmination of about a week of work from three developers to
fix a number of functional and security issues. This patch consists of
work done by the following folks:
- Jason A. Donenfeld <Jason@zx2c4.com>
- Matt Dunwoodie <ncon@noconroy.net>
- Kyle Evans <kevans@FreeBSD.org>
Notable changes include:
- Packets are now correctly staged for processing once the handshake has
completed, resulting in less packet loss in the interim.
- Various race conditions have been resolved, particularly w.r.t. socket
and packet lifetime (panics)
- Various tests have been added to assure correct functionality and
tooling conformance
- Many security issues have been addressed
- if_wg now maintains jail-friendly semantics: sockets are created in
the interface's home vnet so that it can act as the sole network
connection for a jail
- if_wg no longer fails to remove peer allowed-ips of 0.0.0.0/0
- if_wg now exports via ioctl a format that is future proof and
complete. It is additionally supported by the upstream
wireguard-tools (which we plan to merge in to base soon)
- if_wg now conforms to the WireGuard protocol and is more closely
aligned with security auditing guidelines
Note that the driver has been rebased away from using iflib. iflib
poses a number of challenges for a cloned device trying to operate in a
vnet that are non-trivial to solve and adds complexity to the
implementation for little gain.
The crypto implementation that was previously added to the tree was a
super complex integration of what previously appeared in an old out of
tree Linux module, which has been reduced to crypto.c containing simple
boring reference implementations. This is part of a near-to-mid term
goal to work with FreeBSD kernel crypto folks and take advantage of or
improve accelerated crypto already offered elsewhere.
There's additional test suite effort underway out-of-tree taking
advantage of the aforementioned jail-friendly semantics to test a number
of real-world topologies, based on netns.sh.
Also note that this is still a work in progress; work going further will
be much smaller in nature.
MFC after: 1 month (maybe)
This should allow the test to pass in Jenkins. Testing it locally now
reports "passed" instead of "invalid TAP data".
While touching this file also fix some shellcheck warnings that were
pointed out by my IDE.
Reviewed By: lwhsu, afedorov
Differential Revision: https://reviews.freebsd.org/D29054
The argument has to be a single whitespace-separate value. While touching
all these lines also add ksh93, since `atf_set "require.progs"` overrides
the default value specified in the Kyuafile. This then results in tests
being executed despite ksh93 not being installed.
Reviewed By: asomers
Differential Revision: https://reviews.freebsd.org/D29066
It seems like GCC's -Wsign-compare is stricter and also warns for
constants. Appease GCC by adding the required casts.
Fixes: 96a9e50e63 ("ptrace_test: Add more debug output on test failures")
Reported by: Jenkins CI
This makes the `kyua report --verbose` output a lot easier to parse when
looking at failed tests. It also fixes the closefrom() test since I
tested my changes with this commit but forgot to push it together with
fa32350347.
Fixes: fa32350347 ("close_range: add audit support")
Instead of running tests one-by-one with the shell wrapper we now run
the full gtest testsuite twice (once as root, once as non root). This
significantly speeds up running tests despite running them twice.
This change also passes the missing -u flag to capsicum-test that caused
test failures (https://bugs.freebsd.org/250178)
Previously, running the testsuite with the wrapper script took ~3s per
test on aarch64 QEMU, i.e. a total of almost 5 minutes.
Now it takes 6 seconds to run all tests twice.
Before:
root@freebsd-aarch64:/usr/tests/sys/capsicum # /usr/bin/time kyua test functional
94/96 passed (2 failed)
309.97 real 58.46 user 244.31 sys
After:
root@freebsd-aarch64:/usr/tests/sys/capsicum # /usr/bin/time kyua test functional
functional:test_root -> passed [2.659s]
functional:test_unprivileged -> passed [2.391s]
2/2 passed (0 failed)
5.48 real 1.06 user 2.52 sys
This overhead is caused by kyua + atf-sh spawning lots of additional
processes and can be avoided by just running the googletest test binary.
syscall seconds calls errors
fork 39.810229456 1275 0
sigprocmask 13.546928736 572 0
i.e. 1275 processes spawned to run a single test.
Test Plan: All tests pass with D28907.
PR: 250178
Reviewed By: lwhsu
Differential Revision: https://reviews.freebsd.org/D29014
This includes various fixes that I submitted recently such as updating the
pdkill() tests for the actual implemented behaviour
(https://github.com/google/capsicum-test/pull/53) and lots of changes to
avoid calling sleep() and replacing it with reliable synchronization
(pull requests 49,51,52,53,54). This should make the testsuite more reliable
when running on Jenkins. Additionally, process status is now retrieved using
libprocstat instead of running `ps` and parsing the output
(https://github.com/google/capsicum-test/pull/50). This fixes one previously
failing test and speeds up execution.
Overall, this update reduces the total runtime from ~60s to about 4-5 seconds.
ATF now opens the results file (without CLOEXEC), so the child actually
has a valid file descriptor 3. To fix this simply use a large number that
will definitely not be a valid file descriptor.
Reviewed by: jhb, cem, lwhsu
Differential Revision: https://reviews.freebsd.org/D28889
I've run these tests many times in a loop on multiple architectures and
it works reliably for me, maybe it's time to retire these skips?
This also adds an additional waitpid to one of the tests to avoid
a potential race condition (suggested by markj@).
PR: 239397, 244056, 239425, 240510, 220841, 243605
Reviewed By: markj
Differential Revision: https://reviews.freebsd.org/D28888
Mostly automatic, using
`CHILD_REQUIRE\(([^|&\n]*) ==` -> `CHILD_REQUIRE_EQ_INT($1,`
`ATF_REQUIRE\(([^|&\n]*) ==` -> `REQUIRE_EQ_INT($1,` followed by
git-clang-format -f and then manually checking ones that contain ||/&&.
Test Plan:
Still getting the same failure but now it prints
`psr.sr_error (0) == EBADF (9) not met` instead of just failing
without printing the values.
PR: 243605
Reviewed By: jhb
Differential Revision: https://reviews.freebsd.org/D28887
This also fixes a typo in the dup test that caused the head function to
not be called. On my test system without python3 the tests are now
skipped instead of failing.
Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D28903
Ensure that we not only block on some interfaces, but also forward on
some. Without the previous commit we wound up discarding on all ports,
rather than only on the ports needed to break the loop.
MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D28917
All supported platforms support thread-local vars and __thread.
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28796
This is basically the same test as the existing STP test, but now on top
of VLAN interfaces instead of directly using the epair devices.
MFC after: 1 week
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D28861
df093aa946 linked against libprivateauditd.a, but that is currently
(and incorrectly) built as position-dependent. For now just force PIE
off for this test to fix the WITH_PIE build.
Sponsored by: The FreeBSD Foundation
In the CheriBSD CI we reproducibly see the first test in sys/audit
(administrative:acct_failure) fail due to a missing startup message.
It appears this is caused by a race condition when starting auditd:
`service auditd onestart` returns as soon as the initial auditd() parent
exits (after the daemon(3) call).
We can avoid this problem by setting up the auditd infrastructure
in-process: libauditd contains audit_quick_{start,stop}() functions that
look like they are ideally suited to this task.
This patch also avoids forking lots of shell processes for each of the 418
tests by using `auditon(A_SENDTRIGGER, &trigger, sizeof(trigger))` to check
for a running auditd(8) instead of using `service auditd onestatus`.
With these two changes (and D28388 to fix the XFAIL'd test) I can now
boot and run `cd /usr/tests/sys/audit && kyua test` without any failures
in a single-core QEMU instance. Before there would always be at least one
failed test.
Besides making the tests more reliable in CI, a nice side-effect of this
change is that it also significantly speeds up running them by avoiding
lots of fork()/execve() caused by shell scripts:
Running kyua test on an AArch64 QEMU took 315s before and now takes 68s,
so it's roughly 3.5 times faster. This effect is even larger when running
on a CHERI-RISC-V QEMU since emulating CHERI instructions on an x86 host
is noticeably slower than emulating AArch64.
Test Plan: aarch64+amd64 QEMU no longer fail.
Reviewed By: asomers
Differential Revision: https://reviews.freebsd.org/D28451
If we install the scapy package (which we do list as a dependency) we
don't automatically install python (but we do have python3).
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC (“Netgate”’)
Traditionally routing socket code did almost zero checks on
the input message except for the most basic size checks.
This resulted in the unclear KPI boundary for the routing system code
(`rtrequest*` and now `rib_action()`) w.r.t message validness.
Multiple potential problems and nuances exists:
* Host bits in RTAX_DST sockaddr. Existing applications do send prefixes
with hostbits uncleared. Even `route(8)` does this, as they hope the kernel
would do the job of fixing it. Code inside `rib_action()` needs to handle
it on its own (see `rt_maskedcopy()` ugly hack).
* There are multiple way of adding the host route: it can be DST without
netmask or DST with /32(/128) netmask. Also, RTF_HOST has to be set correspondingly.
Currently, these 2 options create 2 DIFFERENT routes in the kernel.
* no sockaddr length/content checking for the "secondary" fields exists: nothing
stops rtsock application to send sockaddr_in with length of 25 (instead of 16).
Kernel will accept it, install to RIB as is and propagate to all rtsock consumers,
potentially triggering bugs in their code. Same goes for sin_port, sin_zero, etc.
The goal of this change is to make rtsock verify all sockaddr and prefix consistency.
Said differently, `rib_action()` or internals should NOT require to change any of the
sockaddrs supplied by `rt_addrinfo` structure due to incorrectness.
To be more specific, this change implements the following:
* sockaddr cleanup/validation check is added immediately after getting sockaddrs from rtm.
* Per-family dst/netmask checks clears host bits in dst and zeros all dst/netmask "secondary" fields.
* The same netmask checking code converts /32(/128) netmasks to "host" route case
(NULL netmask, RTF_HOST), removing the dualism.
* Instead of allowing ANY "known" sockaddr families (0<..<AF_MAX), allow only actually
supported ones (inet, inet6, link).
* Automatically convert `sockaddr_sdl` (AF_LINK) gateways to
`sockaddr_sdl_short`.
Reported by: Guy Yur <guyyur at gmail.com>
Reviewed By: donner
Differential Revision: https://reviews.freebsd.org/D28668
MFC after: 3 days
This allows d_off to be used with lseek to position the file so that
getdirentries(2) will return the next entry. It is not used by
readdir(3).
PR: 253411
Reported by: John Millikin <jmillikin@gmail.com>
Reviewed by: cem
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28605
I changed the Makefile to use SRCS instead of LDADD, but since there is
still and absolute path to the source the .o file was created inside the
source directory instead of the build directory.
It would be nice if this was an error/warning by default, but for now just
fix this issue by using .PATH and the base name of the file.
Reported by: cy, peterj
to be a true RFC 6598 NAT444 setup, where each network segment (e.g. user,
subnet) can have their own dedicated port aliasing ranges.
Reviewed by: donner, kp
Approved by: 0mp (mentor), donner, kp
Differential Revision: https://reviews.freebsd.org/D23450
In the CheriBSD CI, we run the testsuite with /tmp as tmpfs. This causes
the extattr audit tests to fail since tmpfs does not (yet) support
extattrs. Skip those tests if the target path is on a file system that
does not support extended file attributes.
While touching these two files also convert the ATF_REQUIRE_EQ(-1, ...)
checks to use ATF_REQURIE_ERRNO().
Reviewed By: asomers
Differential Revision: https://reviews.freebsd.org/D28392
This avoids a SIGILL when running these tests on QEMU (which
defaults to a basic amd64 CPU without SSE4.2).
This commit also tests the table-based implementations in addition to
testing the hw-accelerated crc32 versions.
Reviewed By: cem, kib, markj
Differential Revision: https://reviews.freebsd.org/D28395
This changes the behaviour to a 30s total timeout (needed when running
on slow emulated uniprocessor systems) and timing out after 10s without
any input. This also uses timespecsub() instead of ignoring the
nanoseconds field.
After this change the tests runs more reliably on QEMU and time out less
frequently.
Reviewed By: asomers
Differential Revision: https://reviews.freebsd.org/D28391
rtinit[1]() is a function used to add or remove interface address prefix routes,
similar to ifa_maintain_loopback_route().
It was intended to be family-agnostic. There is a problem with this approach
in reality.
1) IPv6 code does not use it for the ifa routes. There is a separate layer,
nd6_prelist_(), providing interface for maintaining interface routes. Its part,
responsible for the actual route table interaction, mimics rtenty() code.
2) rtinit tries to combine multiple actions in the same function: constructing
proper route attributes and handling iterations over multiple fibs, for the
non-zero net.add_addr_allfibs use case. It notably increases the code complexity.
3) dstaddr handling. flags parameter re-uses RTF_ flags. As there is no special flag
for p2p connections, host routes and p2p routes are handled in the same way.
Additionally, mapping IFA flags to RTF flags makes the interface pretty messy.
It make rtinit() to clash with ifa_mainain_loopback_route() for IPV4 interface
aliases.
4) rtinit() is the last customer passing non-masked prefixes to rib_action(),
complicating rib_action() implementation.
5) rtinit() coupled ifa announce/withdrawal notifications, producing "false positive"
ifa messages in certain corner cases.
To address all these points, the following has been done:
* rtinit() has been split into multiple functions:
- Route attribute construction were moved to the per-address-family functions,
dealing with (2), (3) and (4).
- funnction providing net.add_addr_allfibs handling and route rtsock notificaions
is the new routing table inteface.
- rtsock ifa notificaion has been moved out as well. resulting set of funcion are only
responsible for the actual route notifications.
Side effects:
* /32 alias does not result in interface routes (/32 route and "host" route)
* RTF_PINNED is now set for IPv6 prefixes corresponding to the interface addresses
Differential revision: https://reviews.freebsd.org/D28186
Previously, we would accept any kind of LIO_* opcode, including ones
that were intended for in-kernel use only like LIO_SYNC (which is not
defined in userland). The situation became more serious with
022ca2fc7f. After that revision, setting
aio_lio_opcode to LIO_WRITEV or LIO_READV would trigger an assertion.
Note that POSIX does not specify what should happen if aio_lio_opcode is
invalid.
MFC-with: 022ca2fc7f
Reviewed by: jhb, tmunro, 0mp
Differential Revision: <https://reviews.freebsd.org/D28078
aio_fsync(O_DSYNC, ...) is the asynchronous version of fdatasync(2).
Reviewed by: kib, asomers, jhb
Differential Review: https://reviews.freebsd.org/D25071
POSIX AIO is great, but it lacks vectored I/O functions. This commit
fixes that shortcoming by adding aio_writev and aio_readv. They aren't
part of the standard, but they're an obvious extension. They work just
like their synchronous equivalents pwritev and preadv.
It isn't yet possible to use vectored aiocbs with lio_listio, but that
could be added in the future.
Reviewed by: jhb, kib, bcr
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D27743
This updates the FUSE protocol to 7.28, though most of the new features
are optional and are not yet implemented.
MFC after: 2 weeks
Relnotes: yes
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27818
An order-of-operations problem caused an expectation intended for
FUSE_READ to instead match FUSE_ACCESS. Surprisingly, only one test
case was affected.
MFC after: 2 weeks
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27818
FUSE_LSEEK reports holes on fuse file systems, and is used for example
by bsdtar.
MFC after: 2 weeks
Relnotes: yes
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27804
maxphys is now a tunable, ever since r368124. The default value is also
larger than it used to be. That broke several fusefs tests that made
assumptions about maxphys.
* WriteCluster.clustering used the MAXPHYS compile-time constant.
* WriteBackAsync.direct_io_partially_overlaps_cached_block implicitly
depended on the default value of maxphys. Fix it by making the
dependency explicit.
* Write.write_large implicitly assumed that maxphys would be no more
than twice maxbcachebuf. Fix it by explicitly setting m_max_write.
* WriteCluster.clustering and several others failed because the MockFS
module did not work for max_write > 128KB (which most tests would set
when maxphys > 256KB). Limit max_write accordingly. This is the same
as fusefs-libs's behavior.
* Bmap's tests were originally written for MAXPHYS=128KB. With larger
values, the simulated file size was too small.
PR: 252096
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D27769
It turns out pf incorrectly updates the TCP checksum if the TCP option
we're modifying is not 2-byte algined with respect to the start of the
packet.
Create a TCP packet with such an option and throw it through a scrub
rule, which will update timestamps and modify the packet.
PR: 240416
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27688
Macfilter to route packets through different hooks based on sender MAC address.
Based on ng_macfilter written by Pekka Nikander
Sponsered by Retina b.v.
Reviewed by: afedorov
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D27268