Commit Graph

385 Commits

Author SHA1 Message Date
Gordon Bergling
299fcf402d fsck_ffs(8): Fix a typo in a source code comment
- s/it it/if it/

MFC after:	3 days
2022-04-09 14:38:00 +02:00
Kirk McKusick
2983ec0a87 Ensure that fsck(8) / fsck_ffs(8) produces the correct exit code
for missing devices.

The fsck_ffs(8) utility uses its internal function openfilesys()
when opening a disk to be checked. This change avoids the use
of pfatal() in openfilesys() which always exits with failure (exit
value 8) so that the caller can choose the correct exit value.
In the case of a non-existent device it should exit with value 3
which allows the startup system to wait for drives (such as those
attached by USB) to come online.

Reported by: karels
Tested by:   karels
PR:          262580
MFC after:   3 days
2022-03-16 11:37:15 -07:00
Kirk McKusick
c5d476c98c Update fsdb(8) to reflect new structure of fsck_ffs(8).
The cleanup of fsck_ffs(8) in commit c0bfa109b9 broke fsdb(8).
This commit adds the one-line update needed in fsdb(8) to make it
work with the new fsck_ffs(8) structure.

Reported by: Chuck Silvers
Tested by:   Chuck Silvers
MFC after:   3 days
2022-02-23 15:40:58 -08:00
Kirk McKusick
7a1c1f6a03 Avoid unaligned writes by fsck_ffs(8).
Normally fsck_ffs never does reads or writes that are not aligned
to the size of one of the checked filesystems fragments. The one
exception is when it finds that it needs to write the superblock
recovery information. Here it will write with the alignment reported
by the underlying disk as its sector size as reported by an
ioctl(diskfd, DIOCGSECTORSIZE, &secsize).

Modern disks have a sector size of 4096, but for backward compatibility
with older disks will report that they have a sector size of 512.
When presented with a 512 byte write, they have to read the associated
4096 byte sector, replace the 512 bytes to be written, and write
the updated 4096 byte sector back to the disk. Unfortunately, some
disks report that they have 512 sectors, but fail writes that are not
aligned to 4096 boundaries and are a multiple of 4096 bytes in size.

This commit updates fsck_ffs(8) so that it uses the filesystem fragment
size as the smallest size and alignment for doing writes rather than
the disk's reported sector size.

Reported by:  Andriy Gapon
MFC after:    1 week
2022-02-20 13:21:12 -08:00
Kirk McKusick
9583be047b Properly fix parameter to sysctlnametomib(). 2022-02-04 14:04:12 -08:00
Kirk McKusick
504cb544e2 Fix parameter to sysctlnametomib(); 2022-02-04 14:00:38 -08:00
Kirk McKusick
c0bfa109b9 Have fsck_ffs(8) properly correct superblock check-hash failures.
Part of the problem was that fsck_ffs would read the superblock
multiple times complaining and repairing the superblock check hash
each time and then at the end failing to write out the superblock
with the corrected check hash. This fix reads the superblock just
once and if the check hash is corrected ensures that the fixed
superblock gets written.

Tested by:    Peter Holm
PR:           245916
MFC after:    1 week
Sponsored by: Netflix
2022-02-04 11:47:48 -08:00
Kirk McKusick
c82df0a0bf Whitespace and capitalization cleanups.
No changes intended.

Sponsored by: Netflix
2022-01-05 16:32:48 -08:00
Kirk McKusick
4313e2ae44 Avoid lost buffers in fsck_ffs.
The ino_blkatoff() and indir_blkatoff() functions failed to release
the buffers holding second and third level indirect blocks. This
commit ensures that these buffers are now properly released.

MFC after:    1 week
Sponsored by: Netflix
2021-10-07 15:52:58 -07:00
Kirk McKusick
b31c5a2532 Eliminate an unnecessary rerun request in fsck_ffs.
When fsck_ffs is running in preen mode and finds a zero-length directory,
it deletes that directory. In doing this operation, it unnecessary set
its internal flag saying that fsck_ffs needed to be rerun. This patch
deletes the rerun request for this case.

Reported by:  Mark Johnson
PR:           246962
MFC after:    1 week
Sponsored by: Netflix
2021-09-22 16:20:19 -07:00
Robert Wing
0c5a59252c fsck_ffs: fix background fsck in preen mode
Background checks are only allowed for mounted filesystems - don't try
to open the device for writing when performing a background check.

While here, remove a debugging printf that's commented out.

PR:             256746
Fixes:          5cc52631b3
Reviewed by:	mckusick
MFC After:      1 week
Differential Revision:	https://reviews.freebsd.org/D30880
2021-07-11 12:47:27 -08:00
Alan Somers
3874c0abb0 [skip ci] correct a few SPDX license tags
These were all incorrectly labeled as 2-clause BSD licenses by a
semi-automated process, when in fact they are 3-clause.

Discussed with:	pfg, imp
MFC after:	2 weeks
Sponsored by:	Axcient
2021-07-07 13:52:20 -06:00
Chuck Silvers
ed1a156b03 fsck_ffs: don't try to write in read-only mode
Skip trying to change fs_mtime for SU+J if we are running read-only.

Reviewed by:    mckusick
Sponsored by:	Netflix
2021-06-29 14:29:15 -07:00
Robert Wing
441e69e419 fsck_ufs: fix segfault with gjournal
The segfault was being hit in ckfini() (sbin/fsck_ffs/fsutil.c) while
attempting to traverse the buffer cache. The tail queue used for the
buffer cache was not initialized before dropping into gjournal_check().

Initialize the buffer cache before calling gjournal_check().

PR:             245907
Reviewed by:    jhb, mckusick
MFC after:      1 week
Differential Revision:  https://reviews.freebsd.org/D30537
2021-06-02 18:30:20 -08:00
Robert Wing
9b0f1d64b0 Revert "Fix fsck_ufs segfaults with gjournal (SU+J)"
Fix fsck for 32-bit platforms.

This reverts commit f190f9193b.
2021-05-28 18:59:07 -08:00
Kirk McKusick
5c9e9eb7a2 Fix fsck_ufs segfault when it needs to rerun.
The segfault was being hit in the rerun of Pass 1 in ginode() when
trying to get an inode that needs to be repaired. When the first run
of fsck_ffs finishes it clears the inode cache, but ginode() was
failing to check properly and tried to access the deallocated cache entry.

Reported by:  Peter Holm
Reviewed by:  Chuck Silvers
Tested by:    Peter Holm and Chuck Silvers
MFC after:    3 days
Sponsored by: Netflix
2021-05-28 19:41:50 -07:00
Robert Wing
20123b25ee fsck_ffs(8): fix divide by zero when debug messages are enabled
Only print buffer cache debug message when a cache lookup has been done.

When running `fsck_ffs -d` on a gjournal'ed filesystem, it's possible
that totalreads is greater than zero when no cache lookup has been
done - causing a divide by zero. This commit fixes the following error:

    Floating point exception (core dumped)

Reviewed by:    mckusick
Differential Revision:  https://reviews.freebsd.org/D30370
2021-05-22 11:03:36 -08:00
Kirk McKusick
f190f9193b Fix fsck_ufs segfaults with gjournal (SU+J)
The segfault was being hit in ckfini() (sbin/fsck_ffs/fsutil.c)
while attempting to traverse the buffer cache to flush dirty buffers.
The tail queue used for the buffer cache was not initialized before
dropping into gjournal_check(). Move the buffer initialization earlier
so that it has been done before calling gjournal_check().

Reported by:  crypt47, nvass
Fix by:       Robert Wing
Tested by:    Robert Wing
PR:           255030
PR:           255979
MFC after:    3 days
Sponsored by: Netflix
2021-05-21 13:42:37 -07:00
Kirk McKusick
fe815b88b5 Fix fsck_ffs Pass 1b error exit "bad inode number 256 to nextinode".
Pass 1b of fsck_ffs runs only when Pass 1 has found duplicate blocks.
Pass 1 only knows that a block is duplicate when it finds the second
instance of its use. The role of Pass 1b is to find the first use
of all the duplicate blocks. It makes a pass over the cylinder groups
looking for these blocks. When moving to the next cylinder group,
Pass 1b failed to properly calculate the starting inode number for
the cylinder group resulting in the above error message when it
tried to read the first inode in the cylinder group.

Reported by:  Px
Tested by:    Px
PR:           255979
MFC after:    3 days
Sponsored by: Netflix
2021-05-19 14:39:24 -07:00
Kirk McKusick
a3628327e7 Ensure that files with no allocated blocks are trimmed to zero length.
UFS does not allow files to end with a hole; it requires that the
last block of a file be allocated. As fsck_ffs(8) initially scans
each allocated inode, it tracks the last allocated block in the
inode. It then checks that the inode's size falls in the last
allocated block. If the last allocated block falls before the size,
a `file size beyond end of allocated file' warning is issued and
the file is shortened to reference the last allocated block (to avoid
having it reference a hole at its end). If the last allocated block
falls after the size, a `partially truncated file' warning is issued
and all blocks following the block referenced by the size are freed.

Because of an incorrect unsigned comparison, this test was failing
to handle files with no allocated blocks but non-zero size (which
should have had their size reset to zero). Once that was fixed the
test started incorrectly complaining about short symbolic links
that place the link path in the inode rather than in a disk block.
Because these symbolic links have a non-zero size, but no allocated
blocks, fsck_ffs wanted to zero out their size. This patch has to
detect and avoid changing the size of such symbolic links.

Reported by:  Chuck Silvers
Tested by:    Chuck Silvers
MFC after:    1 week
Sponsored by: Netflix
2021-05-11 14:52:26 -07:00
Kirk McKusick
689724cb23 Clean up fsck_ffs error message output.
When fsck_ffs is creating a lost+found directory it must allocate
an inode and a filesystem block. If it encounters a cylinder group
with a bad check hash, it complains twice: once for the inode and
again for the filesystem block.

This change suppresses the second complaint.

Reported by:  Chuck Silvers
Tested by:    Chuck Silvers
MFC after:    1 week
Sponsored by: Netflix
2021-04-26 18:43:51 -07:00
Kirk McKusick
84a0e3f957 Make fsck_ffs more persistent in creating a lost+found directory.
When fsck_ffs is running in interactive mode and finds unlinked files,
it offers to either unlink them or place them in a lost+found directory.
If the lost+found directory option is requested and no lost+found
directory exists, fsck_ffs offers to create one. When creating one,
it must allocate an inode and a filesystem block. It attempts to
allocate them from the first cylinder group. If the first cylinder
group has a bad check hash, it gives up.

This change expands the search into later cylinder groups when the
first one fails with a bad check hash.

Reported by:  Chuck Silvers
Tested by:    Chuck Silvers
MFC after:    1 week
Sponsored by: Netflix
2021-04-26 16:48:30 -07:00
Kirk McKusick
fc56fd262d Ensure that all allocated data structures in fsck_ffs are freed.
Several large data structures are allocated by fsck_ffs to track
resource usage. Most but not all were deallocated at the end of
checking each filesystem. This commit consolidates the freeing
of all data structures in one place and adds one that had previously
been missing.

It is important to clean up these data structures as they can be
large. If the previous allocations have not been freed, fsck_ffs
can run out of address space when many large filesystems are being
checked. An alternative would be to fork a new instance of fsck_ffs
for each filesystem to be checked, but we choose to free the small
set of large structures to save the fork overhead.

Reported by:  Chuck Silvers
Tested by:    Chuck Silvers
MFC after:    7 days
Sponsored by: Netflix
2021-04-02 11:58:49 -07:00
Kirk McKusick
7848b25edd Fix fsck_ffs -R finds unfixed duplicate block errors when rerunning.
This fixes a long-standing but very obscure bug in fsck_ffs when
it is run with the -R (rerun after unexpected errors).  It only
occurs if fsck_ffs finds duplicate blocks and they are all contained
in inodes that reside in the first block of inodes (typically among
the first 128 inodes).

Rather than use the usual ginode() interface to walk through the
inodes in pass1, there is a special optimized `getnextinode()'
routine for walking through all the inodes. It has its own private
buffer for reading the inode blocks. If pass 1 finds duplicate
blocks it runs pass 1b to find all the inodes that contain these
duplicate blocks. Pass 1b also uses the `getnextinode()' to search
for the inodes with duplicate blocks. Pass 1b stops when all the
duplicate blocks have been found. If all the duplicate blocks are
found in the first block of inodes, then the getnextinode cache
holds this block of bad inodes. The subsequent cleanup of the inodes
in passes 2-5 is done using ginode() which uses the regular fsck_ffs
cache.

When fsck_ffs restarts, pass1() calls setinodebuf() to point at the
first block of inodes. When it calls getnextinode() to get inode
2, getnextino() sees that its private cache already has the first
set of inodes loaded and starts using them. They are of course the
trashed inodes left over from the previous run of pass1b().

The fix is to always invalidate the getnextinode cache when calling
setinodebuf().

Reported by:  Chuck Silvers
Tested by:    Chuck Silvers
MFC after:    3 days
Sponsored by: Netflix
2021-03-24 17:24:41 -07:00
Kirk McKusick
bc444e2ec6 Fix fsck_ffs Pass 1b error exit "bad inode number 2 to nextinode".
Pass 1b of fsck_ffs runs only when Pass 1 has found duplicate blocks.
When starting up, Pass 1b failed to properly skip over the two unused
inodes at the beginning of the filesystem resulting in the above error
message when it tried to read the filesystem root inode.

Reported by:  Chuck Silvers
Tested by:    Chuck Silvers
MFC after:    3 days
Sponsored by: Netflix
2021-03-24 16:53:28 -07:00
Kirk McKusick
6385cabd5b Do not complain about incorrect cylinder group check-hashes when
asked to add them to a filesystem.

MFC after:    3 days
Sponsored by: Netflix
2021-03-11 22:46:15 -08:00
Konstantin Belousov
d485c77f20 Remove #define _KERNEL hacks from libprocstat
Make sys/buf.h, sys/pipe.h, sys/fs/devfs/devfs*.h headers usable in
userspace, assuming that the consumer has an idea what it is for.
Unhide more material from sys/mount.h and sys/ufs/ufs/inode.h,
sys/ufs/ufs/ufsmount.h for consumption of userspace tools, with the
same caveat.

Remove unacceptable hack from usr.sbin/makefs which relied on sys/buf.h
being unusable in userspace, where it override struct buf with its own
definition.  Instead, provide struct m_buf and struct m_vnode and adapt
code to use local variants.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D28679
2021-02-21 11:38:21 +02:00
Kirk McKusick
8c22cf9b09 Fix fsck_ffs incorrectly reporting "CANNOT READ BLK: NNNN" errors.
A long-standing bug in Pass 1 of fsck_ffs in which it is reading in
blocks of inodes to check their block pointers. It failed to round
up the size of the read to a disk block size. When disks would
accept 512-byte aligned reads, the bug rarely manifested itself.
But many recent disks will no longer accept 512-byte aligned reads
but require 4096-byte aligned reads, so the failure to properly
round-up read sizes to multiples of 4096 bytes makes the error
much more likely to occur.

Reported by:  Peter Holm and others
Tested by:    Peter Holm and Rozhuk Ivan
MFC after:    3 days
Sponsored by: Netflix
2021-01-26 11:46:38 -08:00
Cy Schubert
c6951fac78 Fix 32-bit build post 5cc52631b3. 2021-01-08 11:28:30 -08:00
Kirk McKusick
5cc52631b3 Rewrite the disk I/O management system in fsck_ffs(8). Other than
making fsck_ffs(8) run faster, there should be no functional change.

The original fsck_ffs(8) had its own disk I/O management system.
When gjournal(8) was added to FreeBSD 7, code was added to fsck_ffs(8)
to do the necessary gjournal rollback. Rather than use the existing
fsck_ffs(8) disk I/O system, it wrote its own from scratch. Similarly
when journalled soft updates were added in FreeBSD 9, code was added
to fsck_ffs(8) to do the necessary journal rollback. And once again,
rather than using either of the existing fsck_ffs(8) disk I/O
systems, it wrote its own from scratch. Lastly the fsdb(8) utility
uses the fsck_ffs(8) disk I/O management system. In preparation for
making the changes necessary to enable snapshots to be taken when
using journalled soft updates, it was necessary to have a single
disk I/O system used by all the various subsystems in fsck_ffs(8).

This commit merges the functionality required by all the different
subsystems into a single disk I/O system that supports all of their
needs. In so doing it picks up optimizations from each of them
with the results that each of the subsystems does fewer reads and
writes than it did with its own customized I/O system. It also
greatly simplifies making changes to fsck_ffs(8) since everything
goes through a single place. For example the ginode() function
fetches an inode from the disk. When inode check hashes were added,
they previously had to be checked in the code implementing inode
fetch in each of the three different disk I/O systems. Now they
need only be checked in ginode().

Tested by:    Peter Holm
Sponsored by: Netflix
2021-01-07 15:03:15 -08:00
Kirk McKusick
c8a7a3ffe1 Fix bug in expanding lost+found direct blocks.
Reported by:  Peter Holm
Sponsored by: Netflix
2021-01-06 16:35:01 -08:00
Kirk McKusick
997f81af43 The fsck_ffs program had previously only been able to expand the size
of its lost+found directory by allocating direct block pointers. The
effect was that it was limited to about 19,000 files. One of Peter Holm's
tests produced a filesystem with about 23,000 lost files which meant
that fsck_ffs was unable to recover it. This update allows lost+found
to be expanded into a single indirect block which allows it to store
up to about 6,573,000 lost files.

Reported by:  Peter Holm
Sponsored by: Netflix
2021-01-02 22:31:55 -08:00
Kirk McKusick
68dc94c7d3 Correct and add some comments.
Sponsored by: Netflix
2020-12-31 15:36:33 -08:00
Kirk McKusick
7180f1ab40 Rename pass4check() to freeblock() and move from pass4.c to inode.c.
The new name more accurately describes what it does and the file move
puts it with other similar functions. Done in preparation for future
cleanups. No functional differences intended.

Sponsored by: Netflix
Historic Footnote: my last FreeBSD svn commit
2020-12-18 23:28:27 +00:00
Kirk McKusick
2d34afcd04 Use proper type (ino_t) for inode numbers to avoid improper sign extention
in the Pass 5 checks. The manifestation was fsck_ffs exiting with this error:

  ** Phase 5 - Check Cyl groups
  fsck_ffs: inoinfo: inumber 18446744071562087424 out of range

The error only manifests itself for filesystems bigger than about 100Tb.

Reported by:  Nikita Grechikhin <ngrechikhin at yandex.ru>
MFC after:    2 weeks
Sponsored by: Netflix
2020-10-25 21:04:07 +00:00
Kirk McKusick
996d40f91d Various new check-hash checks have been added to the UFS filesystem
over various major releases. Superblock check hashes were added for
the 12 release and cylinder-group and inode check hashes will appear
in the 13 release.

When a disk with a UFS filesystem is writably mounted, the kernel
clears the feature flags for anything that it does not support. For
example, if a UFS disk from a 12-stable kernel is mounted on an
11-stable system, the 11-stable kernel will clear the flag in the
filesystem superblock that indicates that superblock check-hashs
are being maintained. Thus if the disk is later moved back to a
12-stable system, the 12-stable system will know to ignore its
incorrect check-hash.

If the only filesystem modification done on the earlier kernel is
to run a utility such as growfs(8) that modifies the superblock but
neither updates the check-hash nor clears the feature flag indicating
that it does not support the check-hash, the disk will fail to mount
if it is moved back to its original newer kernel.

This patch moves the code that clears the filesystem feature flags
from the mount code (ffs_mountfs()) to the code that reads the
superblock (ffs_sbget()). As ffs_sbget() is used by the kernel mount
code and is imported into libufs(3), all the filesystem utilities
will now also clear these flags when they make modifications to the
filesystem.

As suggested by John Baldwin, fsck_ffs(8) has been changed to accept
and repair bad superblock check-hashes rather than refusing to run.
This change allows fsck to recover filesystems that have been impacted
by utilities older than those created after this change and is a
sensible thing to do in any event.

Reported by:  John Baldwin (jhb@)
MFC after:    2 weeks
Sponsored by: Netflix
2020-10-25 00:43:48 +00:00
Kirk McKusick
85ee267a3e Update the libufs cgget() and cgput() interfaces to have a similar
API to the sbget() and sbput() interfaces. Specifically they take
a file descriptor pointer rather than the struct uufsd *disk pointer
used by the libufs cgread() and cgwrite() interfaces. Update fsck_ffs
to use these revised interfaces.

No functional changes intended.

Sponsored by: Netflix
2020-09-19 22:48:30 +00:00
Chuck Silvers
e83370448f Move all of the error prints in readsb() from stderr to stdout.
The only output from fsck that should go to stderr is the usage message.
if setup() fails then exit with EEXIT rather than 0.

Reviewed by:	mckusick
Sponsored by:	Netflix
2020-09-01 18:50:26 +00:00
Kirk McKusick
f644caad88 Use the sbput() function to write alternate superblocks so that
they get a checkhash.

PR:           246983
Sponsored by: Netflix
2020-08-15 21:40:36 +00:00
Kirk McKusick
92c839a156 The libufs library needs to track and free the new fs_si structure
in addition to the fs_csp structure that it references.

PR:           247425
Sponsored by: Netflix
2020-06-23 21:28:26 +00:00
Kirk McKusick
0c08ecdff3 Inode check-hash errors were being reported after system crashes.
Trace the cause down to journalled soft updates recovery code in
fsck failing to recompute the check-hash after updating an inode.

As inode check-hash was first introduced to UFS in FreeBSD 13,
there is no need to MFC this commit.

Reported by:  Chuck Silvers
Sponsored by: Netflix
2020-04-10 23:58:07 +00:00
Kirk McKusick
2a18059670 Add an inode check-hash verification when running the journalled
soft update recovery code with the debugging (-d) option.

As inode check-hash was first introduced to UFS in FreeBSD 13,
there is no need to MFC this commit.

Reported by:  Chuck Silvers
Sponsored by: Netflix
2020-04-10 23:49:34 +00:00
Kyle Evans
c3e9752ea1 fsck_ffs/fsdb: fix -fno-common build
This one is also a small list:

- 3x duplicate definition (ufs2_zino, returntosingle, nflag)
- 5x 'needs extern', 3/5 of which are referenced in fsdb

-fno-common will become the default in GCC10/LLVM11.

MFC after:	1 week
2020-03-29 20:03:46 +00:00
Kirk McKusick
c094263a24 When running fsck_ffs manually, do not ask:
USE JOURNAL? [yn]

when the journal timestamp does not match the filesystem mount time
as we are just going to print an error and fall through to a full fsck.
Instead, just run a full fsck.

Requested by: Bjoern A. Zeeb (bz)
MFC after:    7 days
2019-12-24 23:03:12 +00:00
Xin LI
e87e283ce3 Remove unused includes.
MFC after:	2 weeks
2019-12-22 05:44:29 +00:00
Eric van Gyzen
15da40b0af fsck_ffs: fix some memory leaks found by Coverity.
Reported by:	Coverity
CID:		1380549 1380550 1380551
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2019-12-10 20:04:08 +00:00
Kirk McKusick
e39c92986d Replace an uninitialized variable with the correct element from the
superblock when doing recovery with journalled soft updates.

Reported by:  Chuck Silvers
MFC after:    3 days
Sponsored by: Netflix
2019-10-22 22:23:59 +00:00
Kirk McKusick
47d3e2f83b Correct the location of the first backup superblock in fsck_ffs.8.
Make a note in the newfs.8 manual page to update the first backup
superblock location when changing the default fragment size for
the filesystem.

Reported by:  O. Hartmann
2019-08-07 16:56:00 +00:00
Kirk McKusick
967d9fa3bb Treat any inode with bad content as unknown (i.e., ask if it should
be cleared).

Sponsored by: Netflix
2019-07-20 21:39:32 +00:00
Kirk McKusick
3bd88193c6 When running with journaled soft updates, some updated inodes were not
having their check hashes recomputed which resulted in spurious inode
check-hash errors when the system came back up after a crash.

Reported by:  Alan Somers
Sponsored by: Netflix
2019-07-20 21:20:40 +00:00