Several large data structures are allocated by fsck_ffs to track
resource usage. Most but not all were deallocated at the end of
checking each filesystem. This commit consolidates the freeing
of all data structures in one place and adds one that had previously
been missing.
It is important to clean up these data structures as they can be
large. If the previous allocations have not been freed, fsck_ffs
can run out of address space when many large filesystems are being
checked. An alternative would be to fork a new instance of fsck_ffs
for each filesystem to be checked, but we choose to free the small
set of large structures to save the fork overhead.
Reported by: Chuck Silvers
Tested by: Chuck Silvers
MFC after: 7 days
Sponsored by: Netflix
Vinum is a Logical Volume Manager that was introduced in FreeBSD 3.0,
and for FreeBSD 5 was ported to geom(4) as gvinum. gvinum has had no
specific development at least as far back as 2010, and has a number of
known bugs which are unlikely to be resolved.
Add a deprecation notice to raise awareness but state that vinum "may
not be" available in FreeBSD 14. Either it will be removed and the
notice will be updated to "is not" available, or someone will step up
to fix issues and maintain it and we will remove the notice.
Reviewed by: imp (earlier version)
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29424
This fixes a long-standing but very obscure bug in fsck_ffs when
it is run with the -R (rerun after unexpected errors). It only
occurs if fsck_ffs finds duplicate blocks and they are all contained
in inodes that reside in the first block of inodes (typically among
the first 128 inodes).
Rather than use the usual ginode() interface to walk through the
inodes in pass1, there is a special optimized `getnextinode()'
routine for walking through all the inodes. It has its own private
buffer for reading the inode blocks. If pass 1 finds duplicate
blocks it runs pass 1b to find all the inodes that contain these
duplicate blocks. Pass 1b also uses the `getnextinode()' to search
for the inodes with duplicate blocks. Pass 1b stops when all the
duplicate blocks have been found. If all the duplicate blocks are
found in the first block of inodes, then the getnextinode cache
holds this block of bad inodes. The subsequent cleanup of the inodes
in passes 2-5 is done using ginode() which uses the regular fsck_ffs
cache.
When fsck_ffs restarts, pass1() calls setinodebuf() to point at the
first block of inodes. When it calls getnextinode() to get inode
2, getnextino() sees that its private cache already has the first
set of inodes loaded and starts using them. They are of course the
trashed inodes left over from the previous run of pass1b().
The fix is to always invalidate the getnextinode cache when calling
setinodebuf().
Reported by: Chuck Silvers
Tested by: Chuck Silvers
MFC after: 3 days
Sponsored by: Netflix
Pass 1b of fsck_ffs runs only when Pass 1 has found duplicate blocks.
When starting up, Pass 1b failed to properly skip over the two unused
inodes at the beginning of the filesystem resulting in the above error
message when it tried to read the filesystem root inode.
Reported by: Chuck Silvers
Tested by: Chuck Silvers
MFC after: 3 days
Sponsored by: Netflix
Beauty correction for verbose mode or in case we print multiple key
information to not continue with the next options directly after
as we did so far, e.g.:
AES-CCM 2:128-bit
AES-CCM 3:128-bit powersavemode ...
Sponsored-by: The FreeBSD Foundation
MFC-after: 2 weeks
Reviewed-by: adrian
Differential Revision: https://reviews.freebsd.org/D29393
In "heads" output just improve the header to describe all of the columns.
In "hooks" print filter name and hook name delimited with colon, so that
it matches "heads" output and also can be copy-and-pasted straight into
the command line for future "link" command.
After length decisions, we've decided that the if_wg(4) driver and
related work is not yet ready to live in the tree. This driver has
larger security implications than many, and thus will be held to
more scrutiny than other drivers.
Please also see the related message sent to the freebsd-hackers@
and freebsd-arch@ lists by Kyle Evans <kevans@FreeBSD.org> on
2021/03/16, with the subject line "Removing WireGuard Support From Base"
for additional context.
This is the culmination of about a week of work from three developers to
fix a number of functional and security issues. This patch consists of
work done by the following folks:
- Jason A. Donenfeld <Jason@zx2c4.com>
- Matt Dunwoodie <ncon@noconroy.net>
- Kyle Evans <kevans@FreeBSD.org>
Notable changes include:
- Packets are now correctly staged for processing once the handshake has
completed, resulting in less packet loss in the interim.
- Various race conditions have been resolved, particularly w.r.t. socket
and packet lifetime (panics)
- Various tests have been added to assure correct functionality and
tooling conformance
- Many security issues have been addressed
- if_wg now maintains jail-friendly semantics: sockets are created in
the interface's home vnet so that it can act as the sole network
connection for a jail
- if_wg no longer fails to remove peer allowed-ips of 0.0.0.0/0
- if_wg now exports via ioctl a format that is future proof and
complete. It is additionally supported by the upstream
wireguard-tools (which we plan to merge in to base soon)
- if_wg now conforms to the WireGuard protocol and is more closely
aligned with security auditing guidelines
Note that the driver has been rebased away from using iflib. iflib
poses a number of challenges for a cloned device trying to operate in a
vnet that are non-trivial to solve and adds complexity to the
implementation for little gain.
The crypto implementation that was previously added to the tree was a
super complex integration of what previously appeared in an old out of
tree Linux module, which has been reduced to crypto.c containing simple
boring reference implementations. This is part of a near-to-mid term
goal to work with FreeBSD kernel crypto folks and take advantage of or
improve accelerated crypto already offered elsewhere.
There's additional test suite effort underway out-of-tree taking
advantage of the aforementioned jail-friendly semantics to test a number
of real-world topologies, based on netns.sh.
Also note that this is still a work in progress; work going further will
be much smaller in nature.
MFC after: 1 month (maybe)
The kernel-side already accepted a persistent-keepalive-interval, so
just add a verb to ifconfig(8) for it and start exporting it so that
ifconfig(8) can view it.
PR: 253790
MFC after: 3 days
Discussed with: decke
The way that wireguard is designed does not actually require all peers
to have endpoints. In an architecture that might mimic a traditional
VPN server <-> client, the wg interface on a server would have a number
of peers without set endpoints -- the expectation is that the "clients"
will connect to the "server" peer, which will authenticate the
connection as a known peer and learn the endpoint from there.
MFC after: 3 days
Discussed with: decke, grehan (independently)
When the netdump host name fails to resolve, don't print errno, since
it's irrelevant. We might as well use a different exit status, too.
Sponsored by: Dell EMC Isilon
growfs supports growing mounted filesystems (writes are temporarily
suspended while the grow happens). Drop the check for fs_clean == 0
to restore this case. Leave fs_flags check for FS_UNCLEAN or
FS_NEEDSFSCK which represent the state of the filesystem when it was
mounted, and fsck should be run first if they are set.
PR: 253754
Reviewed by: mckusick
MFC after: 3 days
Fixes: 6eb925f845 ("Filesystem utilities that modify the...")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29021
This is a nop in practice, because it cannot be proven that this
particular bzero() is not significant. Make it explicit anyways, rather
than relying on an implementation detail of how the password is
collected.
Discussed with: Andrew Gierth <andrew tao146 riddles org uk>
This should eventually replace the socket passed to the various
handlers. In the meantime, making it global avoids repeatedly opening
and closing handles.
Reported by: kp
Reviewed by: kp (earlier version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28990
Also trimmed an unused block of code that never prints out LAGG_PROTOS.
Reviewed by: kp (earlier version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28961
A trivial change now that ifconfig is already using libifconfig.
Reviewed by: kp (earlier version)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28955
After 3e404b8c53, cam_getccb(3) clears the returned CCB, making
a number of calls to CCB_CLEAR_ALL_EXCEPT_HDR(3) unnecessary.
Reviewed By: imp
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27812
Most of table types currently supported by ipfw have only one
algorithm implementation. When user creates such tables, allow
to omit algo name in arguments. E.g. now it is possible:
ipfw table T1 create type number
ipfw table T2 create type iface
ipfw table T3 create type flow
PR: 233072
MFC after: 1 week
Sponsored by: Yandex LLC
Some SATA drives have 'config' set to 0 in the identify block. Rather than rely
on it, use the strings windows uses to display the drive since they are supposed
to be space padded and will always be non-zero.
Make sys/buf.h, sys/pipe.h, sys/fs/devfs/devfs*.h headers usable in
userspace, assuming that the consumer has an idea what it is for.
Unhide more material from sys/mount.h and sys/ufs/ufs/inode.h,
sys/ufs/ufs/ufsmount.h for consumption of userspace tools, with the
same caveat.
Remove unacceptable hack from usr.sbin/makefs which relied on sys/buf.h
being unusable in userspace, where it override struct buf with its own
definition. Instead, provide struct m_buf and struct m_vnode and adapt
code to use local variants.
Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D28679
Currently when peer information is displayed with `ifconfig wgN peer ..`
or `ifconfig wgN peer-list`, the netmask of the first `allowed-ips` will
be used as the netmask of all CIDR in `allowed-ips`. For example, if
the list is `192.168.1.0/24, 172.16.0.0/16`, it will display as
`192.168.1.0/24, 172.16.0.0/24`. While this does not affect the actual
functionality, it is very confusing.
Submitted by: Michael Chiu <nyan -at- myuji.xyz>
Reviewed by: grehan
Differential Revision: https://reviews.freebsd.org/D28655
MFC after: 1 day
The "source" variable was introduced in r26072, probably as the
traditional counterpart to "target". But the "source"/"target" names
suggest the opposite of their actual meaning. With ln, for example, the
source is the real file and the target is the newly created link. In
mount_nullfs the meaning is the opposite: the target is the existing
file system and the source is the newly created mountpoint. Better to
use "target"/"mountpoint" terminology, which matches the man page.
MFC after: 6 weeks
Sponsored by: Axcient
Mountroot isn't documented in the extant manual pages - so this
phrasing, while less absolute and concise, still conveys which
modules are recommended to be handled via loader.conf(5), and it also
does a better job of elucidating that the modules can include filesystem
drivers.
Submitted by: kevans (earlier version)
Reported by: imp, kevans, eugen
Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D28542
Historically receive buffer overflows have been ignored and programs
could not tell if they missed messages or messages had been truncated
because of overflows. Since programs historically do not expect to get
receive overflow errors, this behavior is not the default.
This is really really important for programs that use route(4) to keep in sync
with the system. If we loose a message then we need to reload the full system
state, otherwise the behaviour from that point is undefined and can lead
to chasing bogus bug reports.
While here, also recommend that loader.conf(5) should only be used in
order to get to mountroot, as rc(8) is less fragile, faster, and is
easier to fix by booting to single-user mode instead of having to
blacklist modules in the loader.
MFH: 2 weeks
The output now contains http-alt instead of 8080 and personal-agent
instead of 5555.
This was probably caused by 228e2087a3.
Reviewed By: kp
Differential Revision: https://reviews.freebsd.org/D28481
The tests create a 1GB test file and this causes the tests to fail in the
CheriBSD CI setup where we run tests with a tmpfs mount on /tmp. Tmpfs
does not support sparse files and it appears that tmpfs default to creating
a 1GB mount, so there is not enough space to run these tests.
Instead of checking for at least 1GB of free space, this commit skips the
tests on file systems that do not support sparse files.
Reviewed By: kevans
Differential Revision: https://reviews.freebsd.org/D28463
to be a true RFC 6598 NAT444 setup, where each network segment (e.g. user,
subnet) can have their own dedicated port aliasing ranges.
Reviewed by: donner, kp
Approved by: 0mp (mentor), donner, kp
Differential Revision: https://reviews.freebsd.org/D23450
Verify that the option is passed, error out if it's not.
The problem can be trivially triggered with `ipfw add allow ext6hdr`.
PR: 253169
Reviewed by: kp@
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28447
Originally IFCAP_NOMAP meant that the mbuf has external storage pointer
that points to unmapped address. Then, this was extended to array of
such pointers. Then, such mbufs were augmented with header/trailer.
Basically, extended mbufs are extended, and set of features is subject
to change. The new name should be generic enough to avoid further
renaming.
The OID is saved when we encounter CTLFLAG_SKIP so that descendants can
be skipped as well. We then must not update the skip OID until we are
out of the node. This was achieved by resetting the skip OID once the
prefix no longer matches, but the case where the OID we reset on has
CTLFLAG_SKIP was not accounted for.
Reported by: mav
Reviewed by: mav
MFC after: 2 days
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D28364
A long-standing bug in Pass 1 of fsck_ffs in which it is reading in
blocks of inodes to check their block pointers. It failed to round
up the size of the read to a disk block size. When disks would
accept 512-byte aligned reads, the bug rarely manifested itself.
But many recent disks will no longer accept 512-byte aligned reads
but require 4096-byte aligned reads, so the failure to properly
round-up read sizes to multiples of 4096 bytes makes the error
much more likely to occur.
Reported by: Peter Holm and others
Tested by: Peter Holm and Rozhuk Ivan
MFC after: 3 days
Sponsored by: Netflix
There's no need for a special case here to work around the lack of
DIOCGIFSPEED. That was introduced in FreeBSD in
c1aedfcbd9.
Reported by: jmg@
Reviewed by: donner@
Differential Revision: https://reviews.freebsd.org/D28305
WITHOUT_LIBTHR has been broken for a little over five years now, since the
xz 5.2.0 update introduced a hard liblzma dependency on libthr, and building
a useful system without threading support is becoming increasingly more
difficult.
Additionally, in the five plus years that it's been broken more reverse
dependencies have cropped up in libzstd, libsqlite3, and libcrypto (among
others) that make it more and more difficult to reconcile the effort needed
to fix these options.
Remove the broken options.
PR: 252760
Reviewed by: brooks, emaste, kib
Differential Revision: https://reviews.freebsd.org/D28263
QinQ is better known by this name, so accept it as an alias
Reported-by: Mike Geiger
Reviewed-by: melifaro, hselasky, rpokala
MFC-with: 366917
Sponsored-by: Klara Inc.
Differential-Revision: https://reviews.freebsd.org/D28245
A recent email discussion indicated that a large
accumulation of NFSv4 Opens was occurring on
a mount. This appears to have been caused by a
shared library within the mount being used by
several processes, such that there is always at
least one of these processes running.
A new Open was created by each process and
were not closed, since all the Opens were never
closed. This is alleviated by using the
"oneopenown" mount option.
This man page update attempts to indicate the
use of "oneopenown" for this case.
This is a content change.
Reported by: j.david.lists@gmail.com
Reviewed by: 0mp
MFC: 1 month
Differential Revision: https://reviews.freebsd.org/D28215
This rode in with the OpenZFS import. It may have been necessary at some
point, but it is no longer and it breaks the WITHOUT_DYNAMICROOT build as
it collides with the definition in libspl.
Reported-by: Michael Dexter
-R is currently shorthand for cachefile=none, altroot=<mount>. This is
functionally the same, but perhaps more resilient to future changes that
could be necessary that may be added when -R is specified.
MFC after: 1 week
The in_cksum tests originally tried to simulate a BE environment by
swapping the byte order of the input. But that's overcomplicated, and
didn't actually work on real BE hardware. The correct testing strategy
is just to test on the native endianness, and run the tests in both BE
and LE environments.
Submitted by: Renato Riolino <renato.riolino@eldorado.org.br>
Reviewed By: asomers
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D23193
When retrieving the list of group members we cannot simply use
ifa_lookup(), because it expects the interface to have an IP (v4 or v6)
address. This means that interfaces with no address are not found.
This presents as interfacing being alternately marked as skip and not
whenever the rules are re-loaded.
Happily we only need to fix ifa_grouplookup(). Teach it to also accept
AF_LINK (i.e. interface) node_hosts.
PR: 250994
MFC after: 3 days
The general style in sbin/nvmecontrol apppears to print uint64_t types
using %j, so I'm using that instead of the more general (but admittedly
ugly) PRIu64.
Add decoding of the Device Self-test log page and the ability to start
or abort a test.
Reviewed by: imp, mav
Tested by: Muhammad Ahmad <muhammad.ahmad@seagate.com>
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D27517
making fsck_ffs(8) run faster, there should be no functional change.
The original fsck_ffs(8) had its own disk I/O management system.
When gjournal(8) was added to FreeBSD 7, code was added to fsck_ffs(8)
to do the necessary gjournal rollback. Rather than use the existing
fsck_ffs(8) disk I/O system, it wrote its own from scratch. Similarly
when journalled soft updates were added in FreeBSD 9, code was added
to fsck_ffs(8) to do the necessary journal rollback. And once again,
rather than using either of the existing fsck_ffs(8) disk I/O
systems, it wrote its own from scratch. Lastly the fsdb(8) utility
uses the fsck_ffs(8) disk I/O management system. In preparation for
making the changes necessary to enable snapshots to be taken when
using journalled soft updates, it was necessary to have a single
disk I/O system used by all the various subsystems in fsck_ffs(8).
This commit merges the functionality required by all the different
subsystems into a single disk I/O system that supports all of their
needs. In so doing it picks up optimizations from each of them
with the results that each of the subsystems does fewer reads and
writes than it did with its own customized I/O system. It also
greatly simplifies making changes to fsck_ffs(8) since everything
goes through a single place. For example the ginode() function
fetches an inode from the disk. When inode check hashes were added,
they previously had to be checked in the code implementing inode
fetch in each of the three different disk I/O systems. Now they
need only be checked in ginode().
Tested by: Peter Holm
Sponsored by: Netflix
Now that we've split up the datastructures used by the kernel and
userspace there's essentually no more overlap between the pf_ruleset.c
code used by userspace and kernelspace.
Copy the userspace bits to the pfctl directory and stop using the kernel
file.
Reviewed by: philip
MFC after: 2 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D27764
runtime contain what is needed to boot in single user and repair a
system, bectl could be handy to have in this situation.
Differential Revision: https://reviews.freebsd.org/D27708
of its lost+found directory by allocating direct block pointers. The
effect was that it was limited to about 19,000 files. One of Peter Holm's
tests produced a filesystem with about 23,000 lost files which meant
that fsck_ffs was unable to recover it. This update allows lost+found
to be expanded into a single indirect block which allows it to store
up to about 6,573,000 lost files.
Reported by: Peter Holm
Sponsored by: Netflix
If the kernel was built without INET6, default to ICMP. Or, if it was
built without INET, default to ICMPv6.
PR: 251725
Reported by: jbeich
Reviewed by: jbeich
Tested by: jbeich
MFC with: 368045
aout support in ldconfig hasn't been required since FreeBSD 2.x.
If someone needs to use FreeBSD 2 shared libraries they will be best
served by using a FreeBSD 2 ldconfig as well.
In aa5e1b42e6 we removed the ldconfig a.out invocation from rc.d but
left the support in ldconfig itself. Remove it now.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27481
commit 665b1365fe added a new NFS mount option that is used to set a
non-default X.509 certificate, that can be used for nfs-over-tls NFS
mounts.
This patch adds a description for it to the man page.
Reviewed by: 0mp
Differential Revision: https://reviews.freebsd.org/D27733
Fix broken CTLFLAG_SKIP when present on the first child of the requested
node.
We don't need to ignore skip for the first node because in sysctl_all()
we've implicitly visited the first node already when oid is specified.
The first call to show_var() in here is after we have iterated to the
next node. When the command line specifically requests a non-node sysctl
we go straight into show_var() without calling sysctl_all().
Reported by: jhb
Reviewed by: jhb
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D27674
- no blank before trailing delimiter
- missing section argument: Xr inet_pton
- skipping paragraph macro: Pp before Ss
- unusual Xr order: syslogd after sysrc
- tab in filled text
There were a few multiline NAT examples which used the .Dl macro with
tabs. I converted them to .Bd, which is a more suitable macro for that case.
MFC after: 1 week
- inserting missing end of block: Ss breaks Bl
- skipping paragraph macro: Pp before Ss
- referenced manual not found: Xr nvme 4 (2 times)
- unknown standard specifier: St The
The macro .St can only be used for standards known by mdoc(7). So add a
SEE ALSO section and add a reference to the NVM Express Base Specification.
MFC after: 2 weeks
The new name more accurately describes what it does and the file move
puts it with other similar functions. Done in preparation for future
cleanups. No functional differences intended.
Sponsored by: Netflix
Historic Footnote: my last FreeBSD svn commit
Allow geom(8) to list geoms with the '/dev/' prefix.
`geom part show` accepts the '/dev/' prefix but `geom part list` does not.
Modify find_geom() in sbin/geom/core/geom.c to be consistent with the behavior
of find_geom() in lib/geom/part/geom_part.c.
PR: 188213
Reported by: Ronald F. Guilmette <rfg@tristatelogic.com>
Reviewed by: imp, kevans
Approved by: kevans (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27556
Detection of interface type by filter must happen before detection of
interface type by prefix. Else the following sequence of commands will
try to create a LAGG interface instead of a VLAN interface, which
accidentially worked previously, because the date pointed to by the
ifr_data pointer was not parsed by VLAN create ioctl(2). This is a
regression after r368229, because the VLAN creation now parses the
ifr_data field.
How to reproduce:
# ifconfig lagg0 create
# ifconfig lagg0.256 create
Differential Revision: https://reviews.freebsd.org/D27521
Reviewed by: kib@ and kevans@
Reported by: raul.munoz@custos.es
Sponsored by: Mellanox Technologies // NVIDIA Networking
This has already confused me once (and I'm pretty sure I wrote it), so let's
clarify: unjailing after the command has completed will only happen if we're
interactive and -U has not been specified.
This just folds two conditionals together to make it obvious how -b/-U
interact with each other.
MFC after: 3 days
PR#250770 was actually just a misunderstanding of what
NFS mount options are needed for AmazonEFS mounts.
This patch attempts to clarify the manpage to clarify this.
This is a content change.
PR: 250770
Reviewed by: bcr
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D27430
Building without INET6 support was already possible. Now it's possible to
build ping with only INET6, or even with neither INET nor INET6.
Reported by: bz
Reviewed by: bz
MFC-With: 368045
Differential Revision: https://reviews.freebsd.org/D27394
When invoked as "ping6", ping will now attempt to use ICMPv6 for hostnames
that resolve both IPv4 and IPv6 addresses.
Reviewed by: bz, manu
MFC-With: r368045
Differential Revision: https://reviews.freebsd.org/D27384
If multiple threads are invoking "ifconfig XXX create" a race may occur
which can lead to two different error messages for the same error.
a) ifconfig: SIOCIFCREATE2: File exists
b) ifconfig: interface XXX already exists
This patch ensures ifconfig prints the same error code
for the same case.
Reviewed by: imp@ and kib@
Differential Revision: https://reviews.freebsd.org/D27380
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking