Commit Graph

76 Commits

Author SHA1 Message Date
Kirk McKusick
345bfec109 Provide cache coherency between getnextinode() and ginode()
The fsck_ffs(8) utility has two subsystems for reading and writing
inodes. The getnextinode() interface is used in Pass 1 (and Pass
1b if needed) to sequentially walk through all the inodes in the
filesystem. The ginode() interface is used to read and write
individual inodes. Pass 1 uses a mix of both interfaces. This
change ensures that ginode() returns a pointer to the inode in the
cache maintained by getnextinode() when that interface holds the
requested inode so that all modifications to the inode are made in
a single place and are all written to the disk together.

Reported by:  Peter Holm
Tested by:    Peter Holm
Sponsored by: The FreeBSD Foundation
2022-08-23 23:48:40 -07:00
Kirk McKusick
6e821c35d6 Correctness cleanups in fsck_ffs(8).
Allocation or I/O failures in fsck_ffs(8) could cause segment
faults because of missing checks or not-yet-initialized data
structures. Correct these issues.

Reported by:  Peter Holm
Sponsored by: The FreeBSD Foundation
2022-08-13 13:28:31 -07:00
Kirk McKusick
262b581d17 Properly specify the level of indirect block being looked up.
The value is used only for diagnostic purposes so no functional
change should result.
2022-05-05 16:58:03 -07:00
Kirk McKusick
4313e2ae44 Avoid lost buffers in fsck_ffs.
The ino_blkatoff() and indir_blkatoff() functions failed to release
the buffers holding second and third level indirect blocks. This
commit ensures that these buffers are now properly released.

MFC after:    1 week
Sponsored by: Netflix
2021-10-07 15:52:58 -07:00
Kirk McKusick
5c9e9eb7a2 Fix fsck_ufs segfault when it needs to rerun.
The segfault was being hit in the rerun of Pass 1 in ginode() when
trying to get an inode that needs to be repaired. When the first run
of fsck_ffs finishes it clears the inode cache, but ginode() was
failing to check properly and tried to access the deallocated cache entry.

Reported by:  Peter Holm
Reviewed by:  Chuck Silvers
Tested by:    Peter Holm and Chuck Silvers
MFC after:    3 days
Sponsored by: Netflix
2021-05-28 19:41:50 -07:00
Kirk McKusick
84a0e3f957 Make fsck_ffs more persistent in creating a lost+found directory.
When fsck_ffs is running in interactive mode and finds unlinked files,
it offers to either unlink them or place them in a lost+found directory.
If the lost+found directory option is requested and no lost+found
directory exists, fsck_ffs offers to create one. When creating one,
it must allocate an inode and a filesystem block. It attempts to
allocate them from the first cylinder group. If the first cylinder
group has a bad check hash, it gives up.

This change expands the search into later cylinder groups when the
first one fails with a bad check hash.

Reported by:  Chuck Silvers
Tested by:    Chuck Silvers
MFC after:    1 week
Sponsored by: Netflix
2021-04-26 16:48:30 -07:00
Kirk McKusick
7848b25edd Fix fsck_ffs -R finds unfixed duplicate block errors when rerunning.
This fixes a long-standing but very obscure bug in fsck_ffs when
it is run with the -R (rerun after unexpected errors).  It only
occurs if fsck_ffs finds duplicate blocks and they are all contained
in inodes that reside in the first block of inodes (typically among
the first 128 inodes).

Rather than use the usual ginode() interface to walk through the
inodes in pass1, there is a special optimized `getnextinode()'
routine for walking through all the inodes. It has its own private
buffer for reading the inode blocks. If pass 1 finds duplicate
blocks it runs pass 1b to find all the inodes that contain these
duplicate blocks. Pass 1b also uses the `getnextinode()' to search
for the inodes with duplicate blocks. Pass 1b stops when all the
duplicate blocks have been found. If all the duplicate blocks are
found in the first block of inodes, then the getnextinode cache
holds this block of bad inodes. The subsequent cleanup of the inodes
in passes 2-5 is done using ginode() which uses the regular fsck_ffs
cache.

When fsck_ffs restarts, pass1() calls setinodebuf() to point at the
first block of inodes. When it calls getnextinode() to get inode
2, getnextino() sees that its private cache already has the first
set of inodes loaded and starts using them. They are of course the
trashed inodes left over from the previous run of pass1b().

The fix is to always invalidate the getnextinode cache when calling
setinodebuf().

Reported by:  Chuck Silvers
Tested by:    Chuck Silvers
MFC after:    3 days
Sponsored by: Netflix
2021-03-24 17:24:41 -07:00
Kirk McKusick
8c22cf9b09 Fix fsck_ffs incorrectly reporting "CANNOT READ BLK: NNNN" errors.
A long-standing bug in Pass 1 of fsck_ffs in which it is reading in
blocks of inodes to check their block pointers. It failed to round
up the size of the read to a disk block size. When disks would
accept 512-byte aligned reads, the bug rarely manifested itself.
But many recent disks will no longer accept 512-byte aligned reads
but require 4096-byte aligned reads, so the failure to properly
round-up read sizes to multiples of 4096 bytes makes the error
much more likely to occur.

Reported by:  Peter Holm and others
Tested by:    Peter Holm and Rozhuk Ivan
MFC after:    3 days
Sponsored by: Netflix
2021-01-26 11:46:38 -08:00
Kirk McKusick
5cc52631b3 Rewrite the disk I/O management system in fsck_ffs(8). Other than
making fsck_ffs(8) run faster, there should be no functional change.

The original fsck_ffs(8) had its own disk I/O management system.
When gjournal(8) was added to FreeBSD 7, code was added to fsck_ffs(8)
to do the necessary gjournal rollback. Rather than use the existing
fsck_ffs(8) disk I/O system, it wrote its own from scratch. Similarly
when journalled soft updates were added in FreeBSD 9, code was added
to fsck_ffs(8) to do the necessary journal rollback. And once again,
rather than using either of the existing fsck_ffs(8) disk I/O
systems, it wrote its own from scratch. Lastly the fsdb(8) utility
uses the fsck_ffs(8) disk I/O management system. In preparation for
making the changes necessary to enable snapshots to be taken when
using journalled soft updates, it was necessary to have a single
disk I/O system used by all the various subsystems in fsck_ffs(8).

This commit merges the functionality required by all the different
subsystems into a single disk I/O system that supports all of their
needs. In so doing it picks up optimizations from each of them
with the results that each of the subsystems does fewer reads and
writes than it did with its own customized I/O system. It also
greatly simplifies making changes to fsck_ffs(8) since everything
goes through a single place. For example the ginode() function
fetches an inode from the disk. When inode check hashes were added,
they previously had to be checked in the code implementing inode
fetch in each of the three different disk I/O systems. Now they
need only be checked in ginode().

Tested by:    Peter Holm
Sponsored by: Netflix
2021-01-07 15:03:15 -08:00
Kirk McKusick
7180f1ab40 Rename pass4check() to freeblock() and move from pass4.c to inode.c.
The new name more accurately describes what it does and the file move
puts it with other similar functions. Done in preparation for future
cleanups. No functional differences intended.

Sponsored by: Netflix
Historic Footnote: my last FreeBSD svn commit
2020-12-18 23:28:27 +00:00
Kirk McKusick
d483391306 Followup to -r344552 in which fsck_ffs checks for a size past the
last allocated block of the file and if that is found, shortens the
file to reference the last allocated block thus avoiding having it
reference a hole at its end.

This update corrects an error where fsck_ffs miscalculated the last
logical block of the file when the file contained a large hole.

Reported by:  Jamie Landeg-Jones
Tested by:    Peter Holm
MFC after:    2 weeks
Sponsored by: Netflix
2019-04-13 13:31:06 +00:00
Kirk McKusick
72ef1cb896 Properly calculate the last used logical block of a file when checking
inodes that reference directories. While here tighten the check for
comparing the last logical block with the end of the file.

Reported by:  Peter Holm
Tested by:    Peter Holm
Sponsored by: Netflix
2019-03-02 21:30:01 +00:00
Kirk McKusick
7bcd1fab5a Ensure that inode updates are properly flushed out during the first
pass of fsck_ffs. Some changes, such as check-hash corrections were
being lost.

Reported by: Michael Tuexen (tuexen@)
Tested by:   Michael Tuexen (tuexen@)
MFC after:   3 days
2019-02-19 20:12:12 +00:00
Kirk McKusick
e155208020 Fsck would find, report, and offer to fix inode check-hash failures.
If requested to fix the inode check-hash it would confirm having done
it, but then fail to make the fix. The same code is used in fsdb which,
unlike fsck, would actually fix the inode check-hash.

The discrepancy occurred because fsck has two ways to fetch inodes.
The inode by number function ginode() and the streaming inode
function getnextinode() used during pass1. Fsdb uses the ginode()
function which correctly does the fix, while fsck first encounters
the bad inode check-hash in pass1 where it is using the getnextinode()
function that failed to make the correction. This patch corrects
the getnextinode() function so that fsck now correctly fixes inodes
with incorrect inode check-hashs.

Reported by:  Gary Jennejohn <gljennjohn@gmail.com>
Sponsored by: Netflix
2018-12-15 17:32:47 +00:00
Kirk McKusick
8f829a5cf0 Continuing efforts to provide hardening of FFS. This change adds a
check hash to the filesystem inodes. Access attempts to files
associated with an inode with an invalid check hash will fail with
EINVAL (Invalid argument). Access is reestablished after an fsck
is run to find and validate the inodes with invalid check-hashes.
This check avoids a class of filesystem panics related to corrupted
inodes. The hash is done using crc32c.

Note this check-hash is for the inode itself and not any of its
indirect blocks. Check-hash validation may be extended to also
cover indirect block pointers, but that will be a separate (and
more costly) feature.

Check hashes are added only to UFS2 and not to UFS1 as UFS1 is
primarily used in embedded systems with small memories and low-powered
processors which need as light-weight a filesystem as possible.

Reviewed by:  kib
Tested by:    Peter Holm
Sponsored by: Netflix
2018-12-11 22:14:37 +00:00
Kirk McKusick
8ebae128be Ensure that cylinder-group check-hashes are properly updated when first
creating them and when correcting them when they are found to be corrupted.

Reported by:  Don Lewis (truckman@)
Sponsored by: Netflix
2018-12-05 06:31:50 +00:00
Kirk McKusick
9fc5d538fc In preparation for adding inode check-hashes, clean up and
document the libufs interface for fetching and storing inodes.
The undocumented getino / putino interface has been replaced
with a new getinode / putinode interface.

Convert the utilities that had been using the undocumented
interface to use the new documented interface.

No functional change (as for now the libufs library does not
do inode check-hashes).

Reviewed by:  kib
Tested by:    Peter Holm
Sponsored by: Netflix
2018-11-13 21:40:56 +00:00
Kirk McKusick
2c288c95d9 In preparation for adding inode check-hashes, change the fsck_ffs
inodirty() function to have a pointer to the inode being dirtied.
No functional change (as for now the parameter is ununsed).

Sponsored by: Netflix
2018-10-31 05:17:53 +00:00
Ed Maste
d8ba45e213 Revert r313780 (UFS_ prefix) 2018-03-17 12:59:55 +00:00
Ed Maste
1e2b9afca9 Prefix UFS symbols with UFS_ to reduce namespace pollution
Followup to r313780.  Also prefix ext2's and nandfs's versions with
EXT2_ and NANDFS_.

Reported by:	kib
Reviewed by:	kib, mckusick
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D9623
2018-03-17 01:48:27 +00:00
Kirk McKusick
12487c7243 Fix a read past the end of a buffer in fsck.
To minimize the time spent scanning all of the directories in pass 2
(Check Pathnames), fsck uses a search order based on the location
of their first block. Zero length directories have no first block,
so the array being used to hold the block numbers of directory
inodes was of zero length. Thus a lookup was done past the end of
the array getting at best a random value and at worst a segment
fault.  For zero length directories, this change allocates a one
element block array and initializes it to zero. The effect is that
all zero length directories are handled first in pass 2.

Reviewed by: brooks
Differential Revision: https://reviews.freebsd.org/D14163
2018-02-21 20:32:23 +00:00
Kirk McKusick
957fc241ec Rename cgget => cglookup to clear name space for new libufs function cgget.
No functional change.
2018-01-17 06:31:21 +00:00
Pedro F. Giffuni
8a16b7a18f General further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:49:47 +00:00
Pedro F. Giffuni
f671769766 fsck_ffs: Unsign some variables and make use of reallocarray(3).
Instead of casting listmax and numdirs to unsigned values just define
them as unsigned and avoid the casts. Use reallocarray(3).

While here, fs_ncg is already unsigned so the cast is unnecessary.

Reviewed by:	mckusick
MFC after:	2 weeks
2017-04-22 14:50:11 +00:00
Warner Losh
fbbd9655e5 Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Ed Maste
1dc349ab95 prefix UFS symbols with UFS_ to reduce namespace pollution
Specifically:
  ROOTINO -> UFS_ROOTINO
  WINO -> UFS_WINO
  NXADDR -> UFS_NXADDR
  NDADDR -> UFS_NDADDR
  NIADDR -> UFS_NIADDR
  MAXSYMLINKLEN_UFS[12] -> UFS[12]_MAXSYMLINKLEN (for consistency)

Also prefix ext2's and nandfs's NDADDR and NIADDR with EXT2_ and NANDFS_

Reviewed by:	kib, mckusick
Obtained from:	NetBSD
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D9536
2017-02-15 19:50:26 +00:00
Marcelo Araujo
1120faab41 Use MIN/MAX macros from sys/param.h.
MFC after:	2 weeks.
2016-05-02 01:28:21 +00:00
Pedro F. Giffuni
f32d2926b0 sbin: ake use of our rounddown() macro when sys/param.h is available.
No functional change.
2016-05-01 02:24:05 +00:00
Pedro F. Giffuni
7d5e656214 fsck_ffs for pointers replace 0 with NULL.
Found with devel/coccinelle.

Reviewed by:	mckusick
2016-04-12 22:55:47 +00:00
Kirk McKusick
81fbded23f Revert 248634 and 248643 (e.g., restoring 248625 and 248639).
Build verified by: Glen Barber (gjb@)
2013-03-23 20:00:02 +00:00
Sean Bruno
115f80b8d3 Revert svn r248625
Clang errors around printf could be trivially fixed, but the breakage in
sbin/fsdb were to significant for this type of change.

Submitter of this changeset has been notified and hopefully this can be
restored soon.
2013-03-23 04:26:13 +00:00
Kirk McKusick
776816d32b Speed up fsck by caching the cylinder group maps in pass1 so
that they do not need to be read again in pass5. As this nearly
doubles the memory requirement for fsck, the cache is thrown away
if other memory needs in fsck would otherwise fail. Thus, the
memory footprint of fsck remains unchanged in memory constrained
environments.

This work was inspired by a paper presented at Usenix's FAST '13:
www.usenix.org/conference/fast13/ffsck-fast-file-system-checker

Details of this implementation appears in the April 2013 of ;login:
www.usenix.org/publications/login/april-2013-volume-38-number-2.
A copy of the April 2013 ;login: paper can also be downloaded
from: www.mckusick.com/publications/faster_fsck.pdf.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   4 weeks
2013-03-22 21:50:43 +00:00
Kirk McKusick
ed75b5a156 When running with the -d option, instrument fsck_ffs to track the number,
data type, and running time of its I/O operations.

No functional changes.
2013-02-24 06:44:29 +00:00
Matthew D Fleming
623d7cb663 Fix fsck_ffs build with a 64-bit ino_t.
Original code by:	Gleb Kurtsou
2012-09-27 23:30:58 +00:00
Dag-Erling Smørgrav
d40c066473 Mechanical whitespace cleanup.
MFC after:	3 weeks
2011-04-27 02:55:03 +00:00
Kirk McKusick
7649cb0043 The dump, fsck_ffs, fsdb, fsirand, newfs, makefs, and quot utilities
include sys/time.h instead of time.h. This include is incorrect as
per the manpages for the APIs and the POSIX definitions. This commit
replaces sys/time.h where necessary with time.h.

The commit also includes some minor style(9) header fixup in newfs.

This commit is part of a larger effort by Garrett Cooper started in
//depot/user/gcooper/posix-conformance-work/ -- to make FreeBSD more
POSIX compliant.

Submitted by:  Garrett Cooper   yanegomi at gmail dot com
2011-01-24 06:17:05 +00:00
Kirk McKusick
910b491e7e Update the actions previously attempted by the -D option to make them
robust. With these changes fsck is now able to detect and reliably
rebuild corrupted cylinder group maps. The -D option is no longer
necessary as it has been replaced by a prompt asking whether the
corrupted cylinder group should be rebuilt and doing so when requested.
These actions are only offered and taken when running fsck in manual
mode. Corrupted cylinder groups found during preen mode cause the fsck
to fail.

Add the -r option to free up excess unused inodes. Decreasing the
number of preallocated inodes reduces the running time of future
runs of fsck and frees up space that can allocated to files. The -r
option is ignored when running in preen mode.

Reviewed by: Xin LI <delphij@>
Sponsored by: Rsync.net
2009-02-04 01:02:56 +00:00
Xin LI
14320f1e7f Add a new flag, '-C' which enables a special mode that is intended for
catastrophic recovery.  Currently, this mode only validates whether a
cylindergroup has good signature data, and prompts the user to decide
whether to clear it as a whole.

This mode is useful when there is data damage on a disk and you are
working on copy of the original disk, as fsck_ffs(8) tends to abnormally
exit in such case, as a last resort to recover data from the disk.
2008-04-10 23:49:23 +00:00
Pawel Jakub Dawidek
aef8d2449b Implements gjournal support. If file system has gjournal support enabled
and -p flag was given perform fast file system checking (bascially only
garbage collecting of orphaned objects).

Rename bread() to blread() and bwrite() to blwrite() as we now link to
the libufs library, which also implement functions with that names.

Sponsored by:	home.pl
2006-10-31 22:06:56 +00:00
Don Lewis
af6726e657 Eliminate linked list used to track inodes with an initial link
count of zero and instead encode this information in the inode state.
Pass 4 performed a linear search of this list for each inode in
the file system, which performs poorly if the list is long.

Reviewed by:    sam & keramida (an earlier version of the patch), mckusick
MFC after:	1 month
2004-10-08 20:44:47 +00:00
Scott Long
c3b2344b93 Create DIP_SET() and IBLK_SET() macros to fix lvalue warnings.
Inspired by: kan
2004-09-01 05:48:06 +00:00
Mark Murray
4c723140a4 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core, imp
2004-04-09 19:58:40 +00:00
David E. O'Brien
c69284ca08 Use __FBSDID() to quiet GCC 3.3 warnings. 2003-05-03 18:41:59 +00:00
Maxime Henrion
84fc0d7e7f Fix a bunch of format string warnings which broke
the sparc64 build.

Tested on:	sparc64, i386
2002-07-31 12:01:14 +00:00
Poul-Henning Kamp
599304a42f Warning cleanup.
Format changes by peter
2002-07-30 13:01:25 +00:00
Kirk McKusick
1c85e6a35d This commit adds basic support for the UFS2 filesystem. The UFS2
filesystem expands the inode to 256 bytes to make space for 64-bit
block pointers. It also adds a file-creation time field, an ability
to use jumbo blocks per inode to allow extent like pointer density,
and space for extended attributes (up to twice the filesystem block
size worth of attributes, e.g., on a 16K filesystem, there is space
for 32K of attributes). UFS2 fully supports and runs existing UFS1
filesystems. New filesystems built using newfs can be built in either
UFS1 or UFS2 format using the -O option. In this commit UFS1 is
the default format, so if you want to build UFS2 format filesystems,
you must specify -O 2. This default will be changed to UFS2 when
UFS2 proves itself to be stable. In this commit the boot code for
reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c)
as there is insufficient space in the boot block. Once the size of the
boot block is increased, this code can be defined.

Things to note: the definition of SBSIZE has changed to SBLOCKSIZE.
The header file <ufs/ufs/dinode.h> must be included before
<ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and
ufs_lbn_t.

Still TODO:
Verify that the first level bootstraps work for all the architectures.
Convert the utility ffsinfo to understand UFS2 and test growfs.
Add support for the extended attribute storage. Update soft updates
to ensure integrity of extended attribute storage. Switch the
current extended attribute interfaces to use the extended attribute
storage. Add the extent like functionality (framework is there,
but is currently never used).

Sponsored by: DARPA & NAI Labs.
Reviewed by:	Poul-Henning Kamp <phk@freebsd.org>
2002-06-21 06:18:05 +00:00
Poul-Henning Kamp
381ee4c2e8 UFS2 preparation commit:
Remove support for converting old FFS formats to newer.

Submitted by:	mckusick
Sponspored by: DARPA & NAI Labs.
2002-05-12 23:44:15 +00:00
Warner Losh
b70cd7ee68 o __P removed
o ansi function prototypes
o unifdef -D__STDC__
o __dead2 on usage prototype
o remove now-bogus main prototype
2002-03-20 22:57:10 +00:00
David E. O'Brien
3d438ad61f Remove 'register' keyword.
It does not help modern compilers, and some may take some hit from it.
(I also found several functions that listed *every* of its 10 local vars with
 "register" -- just how many free registers do people think machines have?)
2002-03-20 17:55:10 +00:00
Alfred Perlstein
a2d440dae0 declare locally used globals as static. 2001-12-22 12:35:03 +00:00