Commit Graph

86 Commits

Author SHA1 Message Date
Kirk McKusick
f4fc389524 Properly handle the replacement of a partially allocated root directory.
If the root directory exists but has a bad block number Pass1 will
accept it and setup an inoinfo structure for it. When Pass2 runs
and cannot read the root inode's content because of a bad (or
duplicate) block number, it removes the bad root inode and replaces
it. As part of creating the replacement root inode, it creates an
inoinfo entry for it. But Pass2 did delete the inoinfo entry that
Pass1 had set up for the root inode so ended up with two inoinfo
structures for it. The final step of Pass2 checks that all the ".."
entries are correct adding them if they are missing which resulted
in a second ".." entry being added to the root directory which
definitely did not go over well in the kernel name cache!

Reported by:  Peter Holm
Sponsored by: The FreeBSD Foundation
2022-09-03 14:48:34 -07:00
Kirk McKusick
e688661642 Move the ability to search for alternate UFS superblocks from fsck_ffs(8)
into ffs_sbsearch() to allow use by other parts of the system.

Historically only fsck_ffs(8), the UFS filesystem checker, had code
to track down and use alternate UFS superblocks. Since fsdb(8) used
much of the fsck_ffs(8) implementation it had some ability to track
down alternate superblocks.

This change extracts the code to track down alternate superblocks
from fsck_ffs(8) and puts it into a new function ffs_sbsearch() in
sys/ufs/ffs/ffs_subr.c. Like ffs_sbget() and ffs_sbput() also found
in ffs_subr.c, these functions can be used directly by the kernel
subsystems. Additionally they are exported to the UFS library,
libufs(8) so that they can be used by user-level programs. The new
functions added to libufs(8) are sbfind(3) that is an alternative
to sbread(3) and sbsearch(3) that is an alternative to sbget(3).
See their manual pages for further details.

The utilities that have been changed to search for superblocks are
dumpfs(8), fsdb(8), ffsinfo(8), and fsck_ffs(8). Also, the prtblknos(8)
tool found in tools/diag/prtblknos searches for superblocks.

The UFS specific mount code uses the superblock search interface
when mounting the root filesystem and when the administrator doing
a mount(8) command specifies the force flag (-f). The standalone UFS
boot code (found in stand/libsa/ufs.c) uses the superblock search
code in the hope of being able to get the system up and running so
that fsck_ffs(8) can be used to get the filesystem cleaned up.

The following utilities have not been changed to search for
superblocks: clri(8), tunefs(8), snapinfo(8), fstyp(8), quot(8),
dump(8), fsirand(8), growfs(8), quotacheck(8), gjournal(8), and
glabel(8). When these utilities fail, they do report the cause of
the failure. The one exception is the tasting code used to try and
figure what a given disk contains. The tasting code will remain
silent so as not to put out a slew of messages as it trying to taste
every new mass storage device that shows up.

Reviewed by: kib
Reviewed by: Warner Losh
Tested by:   Peter Holm
Differential Revision: https://reviews.freebsd.org/D36053
Sponsored by: The FreeBSD Foundation
2022-08-13 12:43:40 -07:00
Kirk McKusick
bf46c0a9ae Clean up comments in fsck.h.
No functional change.
2022-05-10 16:06:15 -07:00
Kirk McKusick
c5d476c98c Update fsdb(8) to reflect new structure of fsck_ffs(8).
The cleanup of fsck_ffs(8) in commit c0bfa109b9 broke fsdb(8).
This commit adds the one-line update needed in fsdb(8) to make it
work with the new fsck_ffs(8) structure.

Reported by: Chuck Silvers
Tested by:   Chuck Silvers
MFC after:   3 days
2022-02-23 15:40:58 -08:00
Kirk McKusick
c0bfa109b9 Have fsck_ffs(8) properly correct superblock check-hash failures.
Part of the problem was that fsck_ffs would read the superblock
multiple times complaining and repairing the superblock check hash
each time and then at the end failing to write out the superblock
with the corrected check hash. This fix reads the superblock just
once and if the check hash is corrected ensures that the fixed
superblock gets written.

Tested by:    Peter Holm
PR:           245916
MFC after:    1 week
Sponsored by: Netflix
2022-02-04 11:47:48 -08:00
Kirk McKusick
fc56fd262d Ensure that all allocated data structures in fsck_ffs are freed.
Several large data structures are allocated by fsck_ffs to track
resource usage. Most but not all were deallocated at the end of
checking each filesystem. This commit consolidates the freeing
of all data structures in one place and adds one that had previously
been missing.

It is important to clean up these data structures as they can be
large. If the previous allocations have not been freed, fsck_ffs
can run out of address space when many large filesystems are being
checked. An alternative would be to fork a new instance of fsck_ffs
for each filesystem to be checked, but we choose to free the small
set of large structures to save the fork overhead.

Reported by:  Chuck Silvers
Tested by:    Chuck Silvers
MFC after:    7 days
Sponsored by: Netflix
2021-04-02 11:58:49 -07:00
Kirk McKusick
5cc52631b3 Rewrite the disk I/O management system in fsck_ffs(8). Other than
making fsck_ffs(8) run faster, there should be no functional change.

The original fsck_ffs(8) had its own disk I/O management system.
When gjournal(8) was added to FreeBSD 7, code was added to fsck_ffs(8)
to do the necessary gjournal rollback. Rather than use the existing
fsck_ffs(8) disk I/O system, it wrote its own from scratch. Similarly
when journalled soft updates were added in FreeBSD 9, code was added
to fsck_ffs(8) to do the necessary journal rollback. And once again,
rather than using either of the existing fsck_ffs(8) disk I/O
systems, it wrote its own from scratch. Lastly the fsdb(8) utility
uses the fsck_ffs(8) disk I/O management system. In preparation for
making the changes necessary to enable snapshots to be taken when
using journalled soft updates, it was necessary to have a single
disk I/O system used by all the various subsystems in fsck_ffs(8).

This commit merges the functionality required by all the different
subsystems into a single disk I/O system that supports all of their
needs. In so doing it picks up optimizations from each of them
with the results that each of the subsystems does fewer reads and
writes than it did with its own customized I/O system. It also
greatly simplifies making changes to fsck_ffs(8) since everything
goes through a single place. For example the ginode() function
fetches an inode from the disk. When inode check hashes were added,
they previously had to be checked in the code implementing inode
fetch in each of the three different disk I/O systems. Now they
need only be checked in ginode().

Tested by:    Peter Holm
Sponsored by: Netflix
2021-01-07 15:03:15 -08:00
Kirk McKusick
68dc94c7d3 Correct and add some comments.
Sponsored by: Netflix
2020-12-31 15:36:33 -08:00
Kirk McKusick
7180f1ab40 Rename pass4check() to freeblock() and move from pass4.c to inode.c.
The new name more accurately describes what it does and the file move
puts it with other similar functions. Done in preparation for future
cleanups. No functional differences intended.

Sponsored by: Netflix
Historic Footnote: my last FreeBSD svn commit
2020-12-18 23:28:27 +00:00
Kyle Evans
c3e9752ea1 fsck_ffs/fsdb: fix -fno-common build
This one is also a small list:

- 3x duplicate definition (ufs2_zino, returntosingle, nflag)
- 5x 'needs extern', 3/5 of which are referenced in fsdb

-fno-common will become the default in GCC10/LLVM11.

MFC after:	1 week
2020-03-29 20:03:46 +00:00
Kirk McKusick
0061238fb0 This update eliminates a kernel stack disclosure bug in UFS/FFS
directory entries that is caused by uninitialized directory entry
padding written to the disk. It can be viewed by any user with read
access to that directory. Up to 3 bytes of kernel stack are disclosed
per file entry, depending on the the amount of padding the kernel
needs to pad out the entry to a 32 bit boundry. The offset in the
kernel stack that is disclosed is a function of the filename size.
Furthermore, if the user can create files in a directory, this 3
byte window can be expanded 3 bytes at a time to a 254 byte window
with 75% of the data in that window exposed. The additional exposure
is done by removing the entry, creating a new entry with a 4-byte
longer name, extracting 3 more bytes by reading the directory, and
repeating until a 252 byte name is created.

This exploit works in part because the area of the kernel stack
that is being disclosed is in an area that typically doesn't change
that often (perhaps a few times a second on a lightly loaded system),
and these file creates and unlinks themselves don't overwrite the
area of kernel stack being disclosed.

It appears that this bug originated with the creation of the Fast
File System in 4.1b-BSD (Circa 1982, more than 36 years ago!), and
is likely present in every Unix or Unix-like system that uses
UFS/FFS. Amazingly, nobody noticed until now.

This update also adds the -z flag to fsck_ffs to have it scrub
the leaked information in the name padding of existing directories.
It only needs to be run once on each UFS/FFS filesystem after a
patched kernel is installed and running.

Submitted by: David G. Lawrence <dg@dglawrence.com>
Reviewed by:  kib
MFC after:    1 week
2019-05-03 21:54:14 +00:00
Kirk McKusick
d483391306 Followup to -r344552 in which fsck_ffs checks for a size past the
last allocated block of the file and if that is found, shortens the
file to reference the last allocated block thus avoiding having it
reference a hole at its end.

This update corrects an error where fsck_ffs miscalculated the last
logical block of the file when the file contained a large hole.

Reported by:  Jamie Landeg-Jones
Tested by:    Peter Holm
MFC after:    2 weeks
Sponsored by: Netflix
2019-04-13 13:31:06 +00:00
Kirk McKusick
ac4b20a0a7 After a crash, a file that extends into indirect blocks may end up
shorter than its size resulting in a hole as its final block (which
is a violation of the invarients of the UFS filesystem).

Soft updates will always ensure that the file size is correct when
writing inodes to disk for files that contain only direct block
pointers. However soft updates does not roll back sizes for files
with indirect blocks that it has set to unallocated because their
contents have not yet been written to disk. Hence, the file can
appear to have a hole at its end because the block pointer has been
rolled back to zero when its inode was written to disk. Thus,
fsck_ffs calculates the last allocated block in the file. For files
that extend into indirect blocks, fsck_ffs checks for a size past
the last allocated block of the file and if that is found, shortens
the file to reference the last allocated block thus avoiding having
it reference a hole at its end.

Submitted by: Chuck Silvers <chs@netflix.com>
Tested by:    Chuck Silvers <chs@netflix.com>
MFC after:    1 week
Sponsored by: Netflix
2019-02-25 21:58:19 +00:00
Kirk McKusick
8ebae128be Ensure that cylinder-group check-hashes are properly updated when first
creating them and when correcting them when they are found to be corrupted.

Reported by:  Don Lewis (truckman@)
Sponsored by: Netflix
2018-12-05 06:31:50 +00:00
Kirk McKusick
9fc5d538fc In preparation for adding inode check-hashes, clean up and
document the libufs interface for fetching and storing inodes.
The undocumented getino / putino interface has been replaced
with a new getinode / putinode interface.

Convert the utilities that had been using the undocumented
interface to use the new documented interface.

No functional change (as for now the libufs library does not
do inode check-hashes).

Reviewed by:  kib
Tested by:    Peter Holm
Sponsored by: Netflix
2018-11-13 21:40:56 +00:00
Kirk McKusick
2c288c95d9 In preparation for adding inode check-hashes, change the fsck_ffs
inodirty() function to have a pointer to the inode being dirtied.
No functional change (as for now the parameter is ununsed).

Sponsored by: Netflix
2018-10-31 05:17:53 +00:00
Kirk McKusick
31461aa2f1 Include files missed in 329051. 2018-02-08 23:14:24 +00:00
Kirk McKusick
dffce2150e Refactoring of reading and writing of the UFS/FFS superblock.
Specifically reading is done if ffs_sbget() and writing is done
in ffs_sbput(). These functions are exported to libufs via the
sbget() and sbput() functions which then used in the various
filesystem utilities. This work is in preparation for adding
subperblock check hashes.

No functional change intended.

Reviewed by: kib
2018-01-26 00:58:32 +00:00
Kirk McKusick
a6bbdf81b5 More throughly integrate libufs into fsck_ffs by using its cgput()
routine to write out the cylinder groups rather than recreating the
calculation of the cylinder-group check hash in fsck_ffs.

No functional change intended.
2018-01-24 23:57:40 +00:00
Kirk McKusick
957fc241ec Rename cgget => cglookup to clear name space for new libufs function cgget.
No functional change.
2018-01-17 06:31:21 +00:00
David Bright
469759f8e4 Exit fsck_ffs with non-zero status when file system is not repaired.
When the fsck_ffs program cannot fully repair a file system, it will
output the message PLEASE RERUN FSCK. However, it does not exit with a
non-zero status in this case (contradicting the man page claim that it
"exits with 0 on success, and >0 if an error occurs."  The fsck
rc-script (when running "fsck -y") tests the status from fsck (which
passes along the exit status from fsck_ffs) and issues a "stop_boot"
if the status fails. However, this is not effective since fsck_ffs can
return zero even on (some) errors. Effectively, it is left to a later
step in the boot process when the file systems are mounted to detect
the still-unclean file system and stop the boot.

This change modifies fsck_ffs so that when it cannot fully repair the
file system and issues the PLEASE RERUN FSCK message it also exits
with a non-zero status.

While here, the fsck_ffs man page has also been updated to document
the failing exit status codes used by fsck_ffs. Previously, only exit
status 7 was documented. Some of these exit statuses are tested for in
the fsck rc-script, so they are clearly depended upon and deserve
documentation.

Reviewed by:	mckusick, vangyzen, jilles (manpages)
MFC after:	1 week
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D13862
2018-01-15 19:25:11 +00:00
Pedro F. Giffuni
8a16b7a18f General further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:49:47 +00:00
Pedro F. Giffuni
f671769766 fsck_ffs: Unsign some variables and make use of reallocarray(3).
Instead of casting listmax and numdirs to unsigned values just define
them as unsigned and avoid the casts. Use reallocarray(3).

While here, fs_ncg is already unsigned so the cast is unnecessary.

Reviewed by:	mckusick
MFC after:	2 weeks
2017-04-22 14:50:11 +00:00
Warner Losh
fbbd9655e5 Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Kirk McKusick
6a5972db72 Fsck_ufs was using an int rather than a ufs2_daddr_t to store the
alternate superblock location when given in the -b option. When int
is 32-bits, block numbers larger than 2^32 would get truncated. This
commit changes the storage fpr the alternate superblock location
to a ufs2_daddr_t.

Submitted by: Dmitry Sivachenko <trtrmitya@gmail.com>
2016-08-19 00:03:41 +00:00
Eitan Adler
463a577b27 Fix a ton of speelling errors
arc lint is helpful

Reviewed By: allanjude, wblock, #manpages, chris@bsdjunk.com
Differential Revision: https://reviews.freebsd.org/D3337
2015-10-21 05:37:09 +00:00
Kirk McKusick
eff68496e2 Arguments for malloc and calloc should be size_t, not int.
Use proper bounds check when trying to free cached memory.

Spotted by: Xin Li
Tested by:  Dmitry Sivachenko
MFC after:  2 weeks
2014-02-25 18:25:27 +00:00
Scott Long
7703a6ff27 Add the -R option to allow fsck_ffs to restart itself when too many critical
errors have been detected in a particular run.

Clean up the global state variables so that a restart can happen correctly.

Separate the global variables in fsck_ffs and fsdb to their own file.  This
fixes header sharing with fscd.

Correctly initialize, static-ize, and remove global variables as needed in
dir.c.  This fixes a problem with lost+found directories that was causing
a segfault.

Correctly initialize, static-ize, and remove global variables as needed in
suj.c.

Initialize the suj globals before allocating the disk object, not after.
Also ensure that 'preen' mode doesn't conflict with 'restart' mode

Submitted by:	scottl, max
Reviewed by:	max, mckusick (earlier version)
Obtained from:	Netflix
MFC after:	3 days
2013-12-30 01:16:08 +00:00
Scott Long
ce779f3756 Add a 'surrender' mode to fsck_ffs. With the -S flag, once hard read errors
are encountered, the fsck will stop instead of wasting time chewing through
possibly other errors.

Obtained from:	Netflix
MFC after:	3 days
2013-07-30 22:57:12 +00:00
Dag-Erling Smørgrav
2b5373de83 Add a -Z option which zeroes unused blocks. It can be combined with -E,
in which case unused blocks are first zeroed and then erased.

Reviewed by:	mckusick
MFC after:	3 weeks
2013-04-29 20:13:09 +00:00
Kirk McKusick
81fbded23f Revert 248634 and 248643 (e.g., restoring 248625 and 248639).
Build verified by: Glen Barber (gjb@)
2013-03-23 20:00:02 +00:00
Sean Bruno
115f80b8d3 Revert svn r248625
Clang errors around printf could be trivially fixed, but the breakage in
sbin/fsdb were to significant for this type of change.

Submitter of this changeset has been notified and hopefully this can be
restored soon.
2013-03-23 04:26:13 +00:00
Kirk McKusick
776816d32b Speed up fsck by caching the cylinder group maps in pass1 so
that they do not need to be read again in pass5. As this nearly
doubles the memory requirement for fsck, the cache is thrown away
if other memory needs in fsck would otherwise fail. Thus, the
memory footprint of fsck remains unchanged in memory constrained
environments.

This work was inspired by a paper presented at Usenix's FAST '13:
www.usenix.org/conference/fast13/ffsck-fast-file-system-checker

Details of this implementation appears in the April 2013 of ;login:
www.usenix.org/publications/login/april-2013-volume-38-number-2.
A copy of the April 2013 ;login: paper can also be downloaded
from: www.mckusick.com/publications/faster_fsck.pdf.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   4 weeks
2013-03-22 21:50:43 +00:00
Kirk McKusick
ed75b5a156 When running with the -d option, instrument fsck_ffs to track the number,
data type, and running time of its I/O operations.

No functional changes.
2013-02-24 06:44:29 +00:00
Kirk McKusick
2ec5c91481 Update fsck_ffs buffer cache manager to use TAILQ macros.
No functional changes.
2013-02-15 01:00:48 +00:00
David E. O'Brien
4a83537505 Remove needless (int) casts of write(2)'s 3rd argument.
Also change blwrite() 'size' parameter to a ssize_t to better match
write(2).
2012-09-12 15:36:44 +00:00
Ulrich Spörlein
4b85a12f71 Spelling fixes for sbin/ 2012-01-07 16:09:33 +00:00
Konstantin Belousov
6f100596e5 Change the type of real_dev_bsize variable from long to u_int.
The DIOCGSECTORSIZE takes u_int * as an argument, using long *
causes failures on big-endian targets.

Diagnosed by:	Michiel Boland <boland37 xs4all nl>
PR:	sparc64/163460
Tested by:	pho (x86), flo (sparc64)
MFC after:	1 week
2011-12-20 20:39:00 +00:00
Kirk McKusick
d2404f0470 Break out the pass 5 inode and block map updating into a separate function
so that the function can be used by the journaling soft updates recovery.
2011-07-15 15:43:40 +00:00
Dag-Erling Smørgrav
8d3dfc2691 Add an -E option to mirror newfs's. The idea is that if you have a system
that was built before ffs grew support for TRIM, your filesystem will have
plenty of free blocks that the flash chip doesn't know are free, so it
can't take advantage of them for wear leveling.  Once you've upgraded your
kernel, you enable TRIM on the filesystem (tunefs -t enable), then run
fsck_ffs -E on it before mounting it.

I tested this patch by half-filling an mdconfig'ed filesystem image,
running fsck_ffs -E on it, then verifying that the contents were not
damaged by comparing them to a pristine copy using rsync's checksum
functionality.  There is no reliable way to test it on real hardware.

Many thanks to mckusick@, who provided the tricky parts of this patch and
reviewed the final version.

Reviewed by:	mckusick@
MFC after:	3 weeks
2011-04-29 23:00:23 +00:00
Konstantin Belousov
0947d19a09 In checker, read journal by sectors.
Due to UFS insistence to pretend that device sector size is 512 bytes,
sector size is obtained from ioctl(DIOCGSECTORSIZE) for real devices,
and from the label otherwise. The file images without label have to
be made with 512 sector size.

In collaboration with:	pho
Reviewed by:	jeff
Tested by:	bz, pho
2011-02-12 13:17:14 +00:00
Pawel Jakub Dawidek
d00690aec5 Protect fsck.h from being included twice. 2010-04-24 07:54:49 +00:00
Jeff Roberson
113db2dddb - Merge soft-updates journaling from projects/suj/head into head. This
brings in support for an optional intent log which eliminates the need
   for background fsck on unclean shutdown.

Sponsored by:   iXsystems, Yahoo!, and Juniper.
With help from: McKusick and Peter Holm
2010-04-24 07:05:35 +00:00
Ulf Lilleengen
fd02a3b5c9 - Use volatile for signal variables.
Suggested by:	Jaakko Heinonen <jh -at- saunalahti.fi>
2009-06-02 17:57:24 +00:00
Ulf Lilleengen
a0f163fd83 - Use sig_atomic_t for signal handler variables.
MFC after:	1 week
2009-05-29 20:01:50 +00:00
Kirk McKusick
910b491e7e Update the actions previously attempted by the -D option to make them
robust. With these changes fsck is now able to detect and reliably
rebuild corrupted cylinder group maps. The -D option is no longer
necessary as it has been replaced by a prompt asking whether the
corrupted cylinder group should be rebuilt and doing so when requested.
These actions are only offered and taken when running fsck in manual
mode. Corrupted cylinder groups found during preen mode cause the fsck
to fail.

Add the -r option to free up excess unused inodes. Decreasing the
number of preallocated inodes reduces the running time of future
runs of fsck and frees up space that can allocated to files. The -r
option is ignored when running in preen mode.

Reviewed by: Xin LI <delphij@>
Sponsored by: Rsync.net
2009-02-04 01:02:56 +00:00
David E. O'Brien
111a52201c Add the '-C' "check clean" flag. If the FS is marked clean, skip file
system checking.  However, if the file system is not clean, perform a
full fsck.

Reviewed by:	delphij
Obtained from:	Juniper Networks
2009-01-30 18:33:05 +00:00
Xin LI
7f94ca7233 Rename option 'C' to 'D' (damaged) in order to avoid a conflict with upcoming
Juniper 'C' (clean) flag.

Requested by:	obrien
MFC after:	1 week
2009-01-20 22:49:49 +00:00
Xin LI
14320f1e7f Add a new flag, '-C' which enables a special mode that is intended for
catastrophic recovery.  Currently, this mode only validates whether a
cylindergroup has good signature data, and prompts the user to decide
whether to clear it as a whole.

This mode is useful when there is data damage on a disk and you are
working on copy of the original disk, as fsck_ffs(8) tends to abnormally
exit in such case, as a last resort to recover data from the disk.
2008-04-10 23:49:23 +00:00
Pawel Jakub Dawidek
aef8d2449b Implements gjournal support. If file system has gjournal support enabled
and -p flag was given perform fast file system checking (bascially only
garbage collecting of orphaned objects).

Rename bread() to blread() and bwrite() to blwrite() as we now link to
the libufs library, which also implement functions with that names.

Sponsored by:	home.pl
2006-10-31 22:06:56 +00:00