The size of the journaled soft-updates journal should be big enough
to hold two minutes of filesystem metadata-update activity. The
maximum size of the soft updates journal was set in the 1990s. At
the time it was assummed that disk arrays would top out at 16 drives
and disk writes per drive would top out at 500 per second. Today's
I/O subsystems are considerably bigger and faster than those limits.
Thus this delta removes the hard upper limit and lets tunefs(8) and
newfs(8) set the upper bound based on the size of the filesystem and
its cylinder groups.
Sponsored by: The FreeBSD Foundation
Further updates based on ways Peter Holm found to corrupt UFS
superblocks in ways that could cause kernel hangs or crashes.
No legitimate superblocks should fail as a result of these changes.
Reported by: Peter Holm
Tested by: Peter Holm
Sponsored by: The FreeBSD Foundation
While syntactically correct and even looking correct, it was definitely
not providing the desired result. And it has been this way for nearly
twenty years.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
The "update" mount option must be specified when the "snapshot"
mount option is used. Return EINVAL if the "snapshot" option is
specified without the "update" option also requested.
Reported by: Robert Morris
Reviewed by: kib
PR: 265362
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
This removes some of the complexity needed to maintain HASBUF and
allows for removing injecting SAVENAME by filesystems.
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D36542
Yet more updates based on ways Peter Holm found to corrupt UFS
superblocks in ways that could cause kernel hangs or crashes.
No legitimate superblocks should fail as a result of these changes.
Reported by: Peter Holm
Tested by: Peter Holm
Sponsored by: The FreeBSD Foundation
Further updates based on ways Peter Holm found to corrupt UFS
superblocks in ways that could cause kernel hangs or crashes.
No legitimate superblocks should fail as a result of these changes.
Reported by: Peter Holm
Tested by: Peter Holm
Sponsored by: The FreeBSD Foundation
Further updates based on ways Peter Holm found to corrupt UFS
superblocks in ways that could cause kernel hangs or crashes.
No legitimate superblocks should fail as a result of these changes.
Reported by: Peter Holm
Tested by: Peter Holm
Sponsored by: The FreeBSD Foundation
The function ffs_vgetf() is used to find or load UFS inodes into a
vnode. It first looks up the inode and if found in the cache its
vnode is returned. If it is not already in the cache, a new vnode
is allocated and its associated inode read in from the disk. The
read is done even for inodes that are being initially created.
The contents for the inode on the disk are assumed to be empty. If
the on-disk contents had been corrupted either due to a hardware
glitch or an agent deliberately trying to exploit the system, the
UFS code could panic from the unexpected partially-allocated inode.
Rather then having fsck_ffs(8) verify that all unallocated inodes
are properly empty, it is easier and quicker to add a flag to
ffs_vgetf() to indicate that the request is for a newly allocated
inode. When set, the disk read is skipped and the inode is set to
its expected empty (zero'ed out) initial state. As a side benefit,
an unneeded disk I/O is avoided.
Reported by: Peter Holm
Sponsored by: The FreeBSD Foundation
into ffs_sbsearch() to allow use by other parts of the system.
Historically only fsck_ffs(8), the UFS filesystem checker, had code
to track down and use alternate UFS superblocks. Since fsdb(8) used
much of the fsck_ffs(8) implementation it had some ability to track
down alternate superblocks.
This change extracts the code to track down alternate superblocks
from fsck_ffs(8) and puts it into a new function ffs_sbsearch() in
sys/ufs/ffs/ffs_subr.c. Like ffs_sbget() and ffs_sbput() also found
in ffs_subr.c, these functions can be used directly by the kernel
subsystems. Additionally they are exported to the UFS library,
libufs(8) so that they can be used by user-level programs. The new
functions added to libufs(8) are sbfind(3) that is an alternative
to sbread(3) and sbsearch(3) that is an alternative to sbget(3).
See their manual pages for further details.
The utilities that have been changed to search for superblocks are
dumpfs(8), fsdb(8), ffsinfo(8), and fsck_ffs(8). Also, the prtblknos(8)
tool found in tools/diag/prtblknos searches for superblocks.
The UFS specific mount code uses the superblock search interface
when mounting the root filesystem and when the administrator doing
a mount(8) command specifies the force flag (-f). The standalone UFS
boot code (found in stand/libsa/ufs.c) uses the superblock search
code in the hope of being able to get the system up and running so
that fsck_ffs(8) can be used to get the filesystem cleaned up.
The following utilities have not been changed to search for
superblocks: clri(8), tunefs(8), snapinfo(8), fstyp(8), quot(8),
dump(8), fsirand(8), growfs(8), quotacheck(8), gjournal(8), and
glabel(8). When these utilities fail, they do report the cause of
the failure. The one exception is the tasting code used to try and
figure what a given disk contains. The tasting code will remain
silent so as not to put out a slew of messages as it trying to taste
every new mass storage device that shows up.
Reviewed by: kib
Reviewed by: Warner Losh
Tested by: Peter Holm
Differential Revision: https://reviews.freebsd.org/D36053
Sponsored by: The FreeBSD Foundation
The BIOS loader operates in a very constrained environment. The messages
for the super block integrity tests take up about 12k of space. Compile
them out for the BIOS loader, while leaving it intact for all other
loaders that aren't space constrained. These aren't used in the 'super
tiny' *boot* programs, so no adjustment is needed there.
We reply on the fact that (a) i386 doesn't support 32-bit UEFI booting
and (b) LIBSA_CPUARCH is "i386" when building on both i386 and when
we're building the 32-bit libsa32 library.
This saves about 12k of space for this constrained envrionment and will
take a bit of the pressure off some machines where the loader has grown
too big for their BIOS (see comments in i386/loader/Makefile for
details).
Sponsored by: Netflix
Reviewed by: mckusick
Differential Revision: https://reviews.freebsd.org/D36175
Make most AST handlers dynamically registered. This allows to have
subsystem-specific handler source located in the subsystem files,
instead of making subr_trap.c aware of it. For instance, signal
delivery code on return to userspace is now moved to kern_sig.c.
Also, it allows to have some handlers designated as the cleanup (kclear)
type, which are called both at AST and on thread/process exit. For
instance, ast(), exit1(), and NFS server no longer need to be aware
about UFS softdep processing.
The dynamic registration also allows third-party modules to register AST
handlers if needed. There is one caveat with loadable modules: the
code does not make any effort to ensure that the module is not unloaded
before all threads processed through AST handler in it. In fact, this
is already present behavior for hwpmc.ko and ufs.ko. I do not think it
is worth the efforts and the runtime overhead to try to fix it.
Reviewed by: markj
Tested by: emaste (arm64), pho
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35888
Identify each of the superblock validation checks as either a
warning or a fatal error. Any integrity check that can cause a
system hang or crash is marked as fatal. Those that may simply
lead to poor file layoutor other less good operating conditions
are marked as warning.
Normally both fatal and warning are treated as errors and prevent
the superblock from being loaded. A new flag, UFS_NOWARNFAIL, is
added. When passed to ffs_sbget() it will note warnings that it
finds, but will still proceed with loading the superblock. Note
that when UFS_NOWARNFAIL is used, it also includes UFS_NOHASHFAIL.
No legitimate superblocks should fail as a result of these changes.
Further updates based on analysis of the way the fields are used
in the various filesystem macros defined in fs.h.
Eliminate several checks for non-negative values where the fields
are checked for specific values. Since these specific values are
non-negative, if the value is a verified positive value then it
cannot be negative and such a check is redundant and unnecessary.
No legitimate superblocks should fail as a result of these changes.
Rather than trying to shoehorn flags into the requested superblock
address, create a separate flags parameter to the ffs_sbget()
function in sys/ufs/ffs/ffs_subr.c. The ffs_sbget() function is
used both in the kernel and in user-level utilities through export
to the sbget() function in the libufs(3) library (see sbget(3)
for details). The kernel uses ffs_sbget() when mounting UFS
filesystems, in the glabel(8) and gjournal(8) GEOM utilities,
and in the standalone library used when booting the system
from a UFS root filesystem.
The ffs_sbget() function reads the superblock located at the byte
offset specified by its sblockloc parameter. The value UFS_STDSB
may be specified for sblockloc to request that the standard
location for the superblock be read.
The two existing options are now flags:
UFS_NOHASHFAIL will note if the check hash is wrong but will still
return the superblock. This is used by the bootstrap code to
give the system a chance to come up so that fsck can be run to
correct the problem.
UFS_NOMSG indicates that superblock inconsistency error messages
should not be printed. It is used by programs like fsck that
want to print their own error message and programs like glabel(8)
that just want to know if a UFS filesystem exists on a partition.
One additional flag is added:
UFS_NOCSUM causes only the superblock itself to be returned, but does
not read in any auxiliary data structures like the cylinder group
summary information. It is used by clients like glabel(8) that
just want to check for possible filesystem types. Using UFS_NOCSUM
skips the superblock checks for csum data which allows superblocks
that have corrupted csum data to be read and used.
The validate_sblock() function checks that the superblock has not
been corrupted in a way that can crash or hang the system. Unless
the UFS_NOMSG flag is specified, it will print out any errors that
it finds. Prior to this commit, validate_sblock() returned as soon
as it found an inconsistency so would print at most one message.
It now does all its checks so when UFS_NOMSG has not been specified
will print out everything that it finds inconsistent.
Sponsored by: The FreeBSD Foundation
Reorder a few checks to ensure fields have been checked before
using them to check other fields.
Add eight new checks mostly checking for non-negative values.
No legitimate superblocks should fail as a result of these changes.
With clang 15, the following -Werror warnings are produced:
sys/ufs/ufs/ufs_dirhash.c:1303:16: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
ufsdirhash_init()
^
void
sys/ufs/ufs/ufs_dirhash.c:1319:18: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
ufsdirhash_uninit()
^
void
This is because ufsdirhash_init() and ufsdirhash_uninit() are declared
with (void) argument lists, but defined with empty argument lists. Make
the definitions match the declarations.
MFC after: 3 days
With clang 15, the following -Werror warning is produced:
sys/ufs/ffs/ffs_snapshot.c:204:7: error: variable 'redo' set but not used [-Werror,-Wunused-but-set-variable]
long redo = 0, snaplistsize = 0;
^
The 'redo' variable is only used when DIAGNOSTIC is defined. Ensure it
is only declared and set in that case.
MFC after: 3 days
With clang 15, the following -Werror warning is produced:
sys/ufs/ufs/ufs_dirhash.c:1252:18: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
ufsdirhash_lowmem()
^
void
This is ufsdirhash_lowmem() is declared with a (void) argument list, but
defined with an empty argument list. Make the definition match the
declaration.
MFC after: 3 days
A better fix to commit 9e1f44d044. Rather than coping with the case
where a backup superblock is used, catch the case when the superblock
is being read in and ensure that the standard one is used rather than
the backup one.
The K&R style in UFS and other places in the tree's days are numbered
as this syntax is removed in C2x proposal N2432:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2432.pdf
Though running to nearly 6000 lines of diffs this update should cause
no functional change to the code.
Requested by: Warner Losh
MFC after: 2 weeks
Older versions of growfs(8) failed to correctly update fs_dsize.
Filesystems that have been grown fail the test for fs_dsize's correct
value. For now we exclude the fs_dsize test from the requirements.
Reported by: Edward Tomasz Napiera
Tested by: Edward Tomasz Napiera
Tested by: Peter Holm
MFC after: 1 month (with 076002f24d)
Differential Revision: https://reviews.freebsd.org/D35219
The original check verified that if an alternate superblock has not
been selected that the superblock is located in its standard location.
For UFS1 the with a 65536 block size, the first backup superblock
is at the same location as the UFS2 superblock. Since SBLOCK_UFS2
is the first location checked, the first backup is the superblock
that will be used for a UFS1 filesystems with a 65536 block size.
This patch allows the use of the first backup superblock in that
situation.
Reported by: Peter Holm
Tested by: Peter Holm
MFC after: 1 month (with 076002f24d)
Differential Revision: https://reviews.freebsd.org/D35219
The tests for number of cylinder groups (fs_ncg), inodes per cylinder
group (fs_ipg), and the size and layout of the cylinder group summary
information (fs_csaddr and fs_cssize) were overly restrictive and
would exclude some valid filesystems. These updates avoid precluding
valid fiesystems while still detecting rogue values that can crash or
hang the kernel.
Reported by: Chuck Silvers
Tested by: Peter Holm
MFC after: 1 month (with 076002f24d)
Differential Revision: https://reviews.freebsd.org/D35219
Otherwise the mount point could be unmounted meantime.
Reported and tested by: pho
Reviewed by: jah
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35638
vn_read_from_obj() requires that all pages of a vnode (except the last
partial page) be either completely valid or completely invalid,
but for file systems with block size smaller than PAGE_SIZE,
partially valid pages may exist anywhere in the file.
Do not enable the vn_read_from_obj() path in this case.
Reviewed by: mckusick, kib, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D34836
i_nlink overflow might be transient, i_effnlink indicates the final
value of the link count after all dependencies would be resolved. So if
i_nlink reached the maximum but i_efflink did not, we should be able to
make the link by syncing.
We must sync the whole filesystem to resolve dependencies,
which requires unlocking vnodes locked for VOPs. Use existing
ERELOOKUP/VOP_UNLOCK_PAIR() mechanism to restart the VOP if sync with
unlock was done.
PR: 165392
Reported by: Vsevolod Volkov <vvv@colocall.net>
Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35514
The "offset" argument to vn_io_fault_pgmove() is supposed to be
the offset within the page, but for ffs we currently use the offset
within the block. When the block size is at least as large as the
page size then these values are the same, but when the page size is
larger than the block size then we need to add the offset of
the block within the page as well.
Sponsored by: Netflix
Reviewed by: mckusick, kib, markj
Differential Revision: https://reviews.freebsd.org/D34835
One of the checks was that the cylinder group size (fs_cgsize)
matched that calculated by CGSIZE(). The value calculated by CGSIZE()
has changed over time as the filesystem has evolved. Thus comparing
the value of CGSIZE() of the current generation filesystem may not
match the size as computed by CGSIZE() that was in effect at the
time an older filesystem was created. Therefore the check for
fs_cgsize is changed to simply ensure that it is not larger than
the filesystem blocksize (fs_bsize).
Reported by: Martin Birgmeier
Tested by: Martin Birgmeier
MFC after: 1 month (with 076002f24d)
PR: 264450
Differential Revision: https://reviews.freebsd.org/D35219
Two bugs have been reported with the UFS/FFS superblock integrity
checks that were added in commit 076002f24d.
The code checked that fs_sblockactualloc was properly set to the
location of the superblock. The fs_sblockactualloc field was an
addition to the superblock in commit dffce2150e on Jan 26 2018
and used a field that was zero in filesystems created before it
was added. The integrity check had to be expanded to accept the
fs_sblockactualloc field being zero so as not to reject filesystems
created before Jan 26 2018.
The integrity check set an upper bound on the value of fs_maxcontig
based on the maximum transfer size supported by the kernel. It
required that fs->fs_maxcontig <= maxphys / fs->fs_bsize. The kernel
variable maxphys defines the maximum transfer size permitted by the
controllers and/or buffering. The fs_maxcontig parameter controls the
maximum number of blocks that the filesystem will read or write in
a single transfer. It is calculated when the filesystem is created
as maxphys / fs_bsize. The bug appeared in the loader because it
uses a maxphys of 128K even when running on a system that supports
larger values. If the filesystem was built on a system that supports
a larger maxphys (1M is typical) it will have configured fs_maxcontig
for that larger system so would fail the test when run with the smaller
maxphys used by the loader. So we bound the upper allowable limit
for fs_maxconfig to be able to at least work with a 1M maxphys on the
smallest block size filesystem: 1M / 4096 == 256. We then use the
limit for fs_maxcontig as fs_maxcontig <= MAX(256, maxphys / fs_bsize).
There is no harm in allowing the mounting of filesystems that make larger
than maxphys I/O requests because those (mostly 32-bit machines) can
(very slowly) handle I/O requests that exceed maxphys.
Thanks to everyone who helped sort out the problems and the fixes.
Reported by: Cy Schubert, David Wolfskill
Diagnosis by: Mark Johnston, John Baldwin
Reviewed by: Warner Losh
Tested by: Cy Schubert, David Wolfskill
MFC after: 1 month (with 076002f24d)
Differential Revision: https://reviews.freebsd.org/D35219
Historically only minimal checks were made of a superblock when it
was read in as it was assumed that fsck would have been run to
correct any errors before attempting to use the filesystem. Recently
several bug reports have been submitted reporting kernel panics
that can be triggered by deliberately corrupting filesystem superblocks,
see Bug 263979 - [meta] UFS / FFS / GEOM crash (panic) tracking
which is tracking the reported corruption bugs.
This change upgrades the checks that are performed. These additional
checks should prevent panics from a corrupted superblock. Although
it appears in only one place, the new code will apply to the kernel
modules and (through libufs) user applications that read in superblocks.
Reported by: Robert Morris and Neeraj
Reviewed by: kib
Tested by: Peter Holm
PR: 263979
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D35219
This is needed for in-kernel copy of the code, where allocation might
happen after fs_fmod is cleared in ffs_sbput() but before the write.
Reported by: markj
Reviewed by: chs, markj
PR: 263765
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35149
Copy in-memory struct fs to the superblock buffer under the UFS mutex.
Reviewed by: chs, markj
PR: 263765
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35149