systems less than 1 TB, due to using 32-bits integers for file system block
numbers. This also causes incorrect error reporting for foreground fsck.
Convert it to use ufs2_daddr_t for block numbers.
PR: kern/127951
Submitted by: tegge
MFC after: 1 week
userspace to kernel via nmount(), pass in the strings
"update", "snapshot", "reload".
We want to move away from passing MNT_ flags from userspace -> kernel
via nmount(), and instead favor passing the string options.
catastrophic recovery. Currently, this mode only validates whether a
cylindergroup has good signature data, and prompts the user to decide
whether to clear it as a whole.
This mode is useful when there is data damage on a disk and you are
working on copy of the original disk, as fsck_ffs(8) tends to abnormally
exit in such case, as a last resort to recover data from the disk.
doing the MNT_RELOAD, pass in "ro" and "update"
string mount options to nmount() instead of MNT_RDONLY and MNT_UPDATE flags.
Due to the complexity of the mount parsing code especially
with respect to the root file system, passing in MNT_RDONLY and MNT_UPDATE
flags would do weird things and would cause fsck to convert the root
file system from a read-only mount to read-write.
To test:
- boot into single user mode
- show mounted file systems with: mount
- root file system should be mounted read-only
- fsck /
- show mounted file systems with: mount
- root file system should still be mounted read-only
PR: 120319
MFC after: 1 month
Reported by: yar
number read from cylinder group. Chances that we read a smarshed
cylinder group, and we can not 100% trust information it has
supplied. fsck_ffs(8) will crash otherwise for some cases.
processing the information. chk1 is more prone to crash when insane
information is provided by the on-disk inode, and does not even work
if the inode is being smarshed badly.
whether fs_bsize is larger than MINBSIZE, which is larger than the
value that is used to compared with fs_bsize, the sizeof fs, so the
check followed, will be always true.
By inspecting the code and some old commit log, I believe that the
check must be that *fs_sbsize* is larger than sizeof fs. We round
up the size to nearest dev_bsize, as the smallest accepted fs_sbsize,
personally, I think this can be even changed to equal, because this
number is mostly an invariant in file systems.
With this check, fsck_ffs(8) will be more picky and has better
chance rejecting bad first superblock rather than referring to bad
value it supplied, thus gives better chance for it to check the
filesystem carefully.
read-only, so we can't simply exit right after calling gjournal_check(),
instead we need to ask about super block reload.
Submitted by: Niki Denev <niki@totalterror.net>
PR: misc/113889
Approved by: re (kensmith)
and -p flag was given perform fast file system checking (bascially only
garbage collecting of orphaned objects).
Rename bread() to blread() and bwrite() to blwrite() as we now link to
the libufs library, which also implement functions with that names.
Sponsored by: home.pl
initializing the sysctl mibs data before actually using them.
The original patchset (which is the actual version that is running
on my testboxes) have checked whether all of these sysctls and
refuses to do background fsck if we don't have them. Kirk has
pointed out that refusing running fsck on old kernels is pointless,
as old kernels will recompute the summary at mount time, so I
have removed these checks.
Unfortunatelly, as the checks will initialize the mib values of
those sysctl's, and which are vital for the runtime summary
adjustment to work, we can not simply remove the check, which
will lead to problem when running background fsck over a dirty
volume. Add these checks in a different way: give a warning rather
than refusing to work, and complain if the functionality is not
available when adjustments are necessary.
Noticed by: A power failure at my lab
Pointy hat: me
MFC After: 3 days
very slow process, especially for large file systems that is just
recovered from a crash.
Since the summary is already re-sync'ed every 30 second, we will
not lag behind too much after a crash. With this consideration
in mind, it is more reasonable to transfer the responsibility to
background fsck, to reduce the delay after a crash.
Add a new sysctl variable, vfs.ffs.compute_summary_at_mount, to
control this behavior. When set to nonzero, we will get the
"old" behavior, that the summary is computed immediately at mount
time.
Add five new sysctl variables to adjust ndir, nbfree, nifree,
nffree and numclusters respectively. Teach fsck_ffs about these
API, however, intentionally not to check the existence, since
kernels without these sysctls must have recomputed the summary
and hence no adjustments are necessary.
This change has eliminated the usual tens of minutes of delay of
mounting large dirty volumes.
Reviewed by: mckusick
MFC After: 1 week
count of zero and instead encode this information in the inode state.
Pass 4 performed a linear search of this list for each inode in
the file system, which performs poorly if the list is long.
Reviewed by: sam & keramida (an earlier version of the patch), mckusick
MFC after: 1 month
has only been partly initialized via newfs(8) so that it applies to both
UFS1 and UFS2.
Submitted by: "Xin LI" delphij at frontfree dot net
MFC: maybe?
shuffles the timing and sleep calls in bgfsck from:
sleep timer_on io timer_off io io io io io io io
to
sleep io io io io io io io timer_on io timer_off
The original method basically guaranteed that the timed I/O included a
disk seek every time, which made bgfsck sleep for much longer than
necessary.
Submitted by: Dan Nelson
Reviewed by: kirk
original intention of the less restrictive permissions was to allow
users to move or delete recovered files that they own. However, it
is better to not create world-writable directories by default; the
administrator can always pre-create lost+found if different permissions
are desired.
Reviewed by: mckusick
filesystem that is checked in background. Create the snapshot in this
directory rather than in the root. There are two benefits:
1) For terabyte-sized filesystems, the snapshot may require many
minutes to build. Although the filesystem will not be suspended
during most of the snapshot build, the snapshot file itself is
locked during the entire snapshot build period. Thus, if it is
accessed during the period that it is being built, the process
trying to access it will block holding its containing directory
locked. If the snapshot is in the root, the root will lock and
the system will come to a halt until the snapshot finishes. By
putting the snapshot in a subdirectory, it is out of the likely
path of any process traversing through the root and hence much
less likely to cause a lock race to the root.
2) The dump program is usually run by a non-root user running with
operator group privilege. Such a user is typically not permitted
to create files in the root of a filesystem. By having a directory
in group operator with group write access available, such a user
will be able to create a snapshot there. Having the dump program
create its snapshot in a subdirectory below the root will benefit
from point (1) as well.
Sponsored by: DARPA & NAI Labs.
bandwidth for other processes. Since the sleeping is done from
userland, this avoids the locking issues that affected the kernel
version.
The algorithm used here is to measure a moving average of the times
taken by a sample of read operations and then delay 1 in 8 reads
by 16 times the measured average. This should correspond to a factor
of 3 slowdown, but in practice the factor is larger (3.5 to 4) due
to hz rounding effects.
Reviewed by: mckusick
Approved by: re
trying to use them. Set a minimum value for numdirs when using an
alternate superblock to avoid spurious numdirs == 0 error. Calculate
new fields when using an alternate superblock from a UFS1 filesystem
to avoid segment faulting.
Sponsored by: DARPA & NAI Labs.
the old 8-bit fs_old_flags to the new location the first time that the
filesystem is mounted by a new kernel. One of the unused flags in
fs_old_flags is used to indicate that the flags have been moved.
Leave the fs_old_flags word intact so that it will work properly if
used on an old kernel.
Change the fs_sblockloc superblock location field to be in units
of bytes instead of in units of filesystem fragments. The old units
did not work properly when the fragment size exceeeded the superblock
size (8192). Update old fs_sblockloc values at the same time that
the flags are moved.
Suggested by: BOUWSMA Barry <freebsd-misuser@netscum.dyndns.dk>
Sponsored by: DARPA & NAI Labs.
It seems a common corruption to have them -ve (I've seen it several times)
and if fsck doesn't fix it, it leads to a kernel pagefault.
Reviewd by: kirk
Submitted by: Eric Jacobs <eaja@erols.com> and me independently.
MFC in: 2 days
PR: bin/40967
Approved by: re
fsck_ffs did not need it, but quotacheck did include it from fsck_ffs.
A repocopy has now moved the fsck_ffs/preen.c file to quotacheck/preen.c
quotacheck and fsck should probably use the same checkfstab() function
and it should possibly live in libufs.
Trouble is: they have diverged in the meantime.
At least now fsck_ffs is not in the equation anymore.
Sponsored by: DARPA & NAI Labs.
UFS2 commit.
These bits in essence made any instance of "softupdates expected
corrution", (ie blocks marked allocated but not referenced by an
inode etc) result in a exit value for fsck_ffs of 2.
2 is part of the magic and appearantly undocumented protocol between
fsck_FOO and fsck and means "dump into single user mode ASAP.
Sponsored by: DARPA & NAI Labs.
imposed by the filesystem structure itself remains. With 16k blocks,
the maximum file size is now just over 128TB.
For now, the UFS1 file size limit is left unchanged so as to remain
consistent with RELENG_4, but it too could be removed in the future.
Reviewed by: mckusick
filesystem expands the inode to 256 bytes to make space for 64-bit
block pointers. It also adds a file-creation time field, an ability
to use jumbo blocks per inode to allow extent like pointer density,
and space for extended attributes (up to twice the filesystem block
size worth of attributes, e.g., on a 16K filesystem, there is space
for 32K of attributes). UFS2 fully supports and runs existing UFS1
filesystems. New filesystems built using newfs can be built in either
UFS1 or UFS2 format using the -O option. In this commit UFS1 is
the default format, so if you want to build UFS2 format filesystems,
you must specify -O 2. This default will be changed to UFS2 when
UFS2 proves itself to be stable. In this commit the boot code for
reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c)
as there is insufficient space in the boot block. Once the size of the
boot block is increased, this code can be defined.
Things to note: the definition of SBSIZE has changed to SBLOCKSIZE.
The header file <ufs/ufs/dinode.h> must be included before
<ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and
ufs_lbn_t.
Still TODO:
Verify that the first level bootstraps work for all the architectures.
Convert the utility ffsinfo to understand UFS2 and test growfs.
Add support for the extended attribute storage. Update soft updates
to ensure integrity of extended attribute storage. Switch the
current extended attribute interfaces to use the extended attribute
storage. Add the extent like functionality (framework is there,
but is currently never used).
Sponsored by: DARPA & NAI Labs.
Reviewed by: Poul-Henning Kamp <phk@freebsd.org>
that might have changed, then did a byte-by-byte comparison with
the alternate. If any unused fields got used, they had to be added
to the exception list. Such changes caused too many false alarms.
So, I have changed the comparison algorithm to compare a selected
set of fields that are not expected to change. This new algorithm
causes far fewer false hits and still does a good job of detecting
problems when they have really occurred. In particular, this change
should ease the transition to kernels supporting UFS2 which make
some significant changes to the superblock.
Sponsored by: DARPA, NAI Labs
It does not help modern compilers, and some may take some hit from it.
(I also found several functions that listed *every* of its 10 local vars with
"register" -- just how many free registers do people think machines have?)
These were mainly missing casts or wrong format strings in printf
statements, but there were also missing includes, unused variables,
functions and arguments.
The choice of `long' vs `int' still seems almost random in a lot
of places though.
directory is encountered. This includes the full path of the
directory that will be removed if the user answers "y" to the
"REMOVE?" question.
PR: bin/226851
Submitted by: KOIE Hide <hide@koie.org>
MFC after: 1 week
when comparing with the alternate superblock. These fields are used
for temporary in-core information only. This should fix the "VALUES
IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE" error from
fsck_ffs that has been seen a lot recently.
filesystem needs foreground checking (usually at boot time) or
can defer to background checking (after the system is up and running).
See the manual page, fsck_ffs(8), for details on the -F and -B options.
These options are primarily intended for use by the fsck front end.
All output is directed to stdout so that the output is coherent
when redirected to a file or a pipe. Unify the code with the fsck
front end that allows either a device or a mount point to be
specified as the argument to be checked.
1) Set the FS_NEEDSFSCK flag when unexpected problems are encountered.
2) Clear the FS_NEEDSFSCK flag after a successful foreground cleanup.
3) Refuse to run in background when the FS_NEEDSFSCK flag is set.
4) Avoid taking and removing a snapshot when the filesystem is already clean.
5) Properly implement the force cleaning (-f) flag when in preen mode.
Note that you need to have revision 1.21 (date: 2001/04/14 05:26:28) of
fs.h installed in <ufs/ffs/fs.h> defining FS_NEEDSFSCK for this to compile.
affect current systems until fsck is modified to use these new
facilities. To try out this change, set the fsck passno to zero
in /etc/fstab to cause the filesystem to be mounted without running
fsck, then run `fsck_ffs -p -B <filesystem>' after the system has
been brought up multiuser to run a background cleanup on <filesystem>.
Note that the <filesystem> in question must have soft updates enabled.
field, so it was possible for a filesystem marked clean by fsck_ffs
to cause kernel crashes later when mounted. This could occur when
fsck_ffs was used to repair a badly corrupted filesystem.
As pointed out by bde, it is not sufficient to restrict di_size to
just the superblock fs_maxfilesize limit. The use of 32-bit logical
block numbers (both in fsck and the kernel) induces another file
size limit which is usually lower than fs_maxfilesize. Also, the
old 4.3BSD filesystem does not have fs_maxfilesize initialised.
Following this change, fsck_ffs will enforce exactly the same
file size limits as are used by the kernel.
PR: kern/15065
Discussed with: bde
Reviewed by: bde, mckusick
in-core pointers to summary information. An array in this region
(fs_csp) could overflow on filesystems with a very large number of
cylinder groups (~16000 on i386 with 8k blocks). When this happens,
other fields in the superblock get corrupted, and fsck refuses to
check the filesystem.
Solve this problem by replacing the fs_csp array in 'struct fs'
with a single pointer, and add padding to keep the length of the
128-byte region fixed. Update the kernel and userland utilities
to use just this single pointer.
With this change, the kernel no longer makes use of the superblock
fields 'fs_csshift' and 'fs_csmask'. Add a comment to newfs/mkfs.c
to indicate that these fields must be calculated for compatibility
with older kernels.
Reviewed by: mckusick
a SIGINFO (normally via Ctrl-T), a line will be output indicating
the current phase number and progress information relevant to the
current phase.
Approved by: mckusick
utilities which use bits of fsck_ffs - namely quotacheck and fsdb.
In depth, utilities.c contains blockcheck() which is needed by both,
but also a slew of routines which require bits of the FFS code to be
compiled in. This breaks the fs-specific and non-fs-specific code
up into two files (well, blockcheck() is the only routine in utilities.c,
that'll change later) which makes building fsck_ffs, quotacheck and
fsdb work yet again.
(You won't find commits to fsdb and quotacheck here before I haven't
committed the post-fsck-wrappers version of them yet.)
Approved by: rwatson
Obtained from: NetBSD-current source tree
The beginnings of the fsck wrappers stuff from NetBSD. This particular commit
brings a newly repo-copied sbin/fsck_ffs/ (from sbin/fsck/) into fsck wrappers
mode.
A quick overview (the code reflects this):
* Documentation changed to reflect fsck_ffs instead of fsck
* Simply acts on a single filesystem, doesn't try to do any multiple filesystem
magic - this is done by the fsck wrappers now
And then specific to fsck_ffs:
* link to /sbin/fsck_4.2bsd and /sbin/fsck_ufs. This is because right now
the filesystem is of type ufs not ffs, and that during autodetection the
labeltype rather than the VFS type is used - this is because when doing
an autodetection of filesystem type in the fsck wrapper program, it does
not have any link between label type (4.2bsd, vinum, etc) and VFS string.
Note that this shouldn't break a build since the required buildworld Makefile
magic and import of the fsck wrapper code into src/sbin/fsck/ will happen
in a seperate commit.
which sets the inoinfo's i_parent and i_dotdot to 0, but they never get
set to ROOTINO. This means that propagate will never find lost+found and
its descendents, subdirectories will remain DSTATE (instead of DFOUND)
even though they *are* correctly linked in, and pass4.c will try to
clear them unsuccessfully, thinking that there is no link count from the
DSTATE directory's parent. The result is that you need to run fsck twice
and get link count increasing errors (which are unexpected and fatal
when running in preen mode). The fix is to set i_parent and i_dotdot to
"parent" after the second cacheino() call in dir.c:allocdir().
Obtained from: "Ethan Solomita" <ethan@geocast.com> (of the NetBSD Project)
all have zero length. A non-zero length panic's the kernel when one
of these is deleted.
PR: 19426
Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
Reviewed by: dwmalone@FreeBSD.org
effect on operation of fsck on filesystems without snapshots.
If you get compilation errors, be sure that you have copies of
/usr/include/sys/mount.h (1.94), /usr/include/sys/stat.h (1.21),
and /usr/include/ufs/ffs/fs.h (1.16) as of July 4, 2000 or later.
DIR I=64512 CONNECTED. PARENT WAS I=4032
fsck: cannot find inode 995904
fsdb found the inodes with no problem:
fsdb (inum: 64512)> inode 995904
current inode: directory
I=995904 MODE=40777 SIZE=512
MTIME=Feb 14 15:27:07 2000 [0 nsec]
CTIME=Feb 14 15:27:07 2000 [0 nsec]
ATIME=Feb 24 10:31:58 2000 [0 nsec]
OWNER=nobody GRP=nobody LINKCNT=4 FLAGS=0 BLKCNT=2 GEN=38a41386
Direct blocks: 8094568 0 0 0 0 0 0 0 0 0 0 0
Indirect blocks: 0 0 0
The problem turns out to be a program logic error in fsck. It stores
directory inodes internally in hash lists, using the number of
directories to form the hash key:
inpp = &inphead[inumber % numdirs];
Elsewhere, however, it increments numdirs when it finds unattached
directories. I've made the following fix, which solved the problem in
the case in hand.
Submitted by: Greg Lehey <grog@lemis.com>
Reviewed by: Matthew Dillon <dillon@apollo.backplane.com>
Approved by: Kirk McKusick <mckusick@mckusick.com>
Also, in addition to the previous log message, the last change had a fix
for the case where where f.mntfromname is a relative path like da0a.
Submitted by: bde
- Don't use realpath as stat does the right thing.
- Only check ufs filesystems in getmntpt.
- Dont' bother checking that the ufs-mounted-on
device is a special file. It *must* be a special
file, or ufs wouldn't have mounted it.
Submitted by: Paul Saab <ps@yahoo-inc.com>
Submitted by: Kirk McKusick <mckusick@McKusick.COM>
Obtained from: Mckusick, BSDI and a host of others
This exactly matches Kirks sources imported under the
Tag MCKUSICK2. These are as supplied by kirk with one small
change needed to compile under freeBSD.
Some FreeBSD patches will be added back, though many have been
added to Kirk's sources already.
which caused the reference count of a directory to get doubly
decremented.
PR: bin/8030
Reviewed by: nate
Submitted by: Don Lewis <Don.Lewis@tsc.tdk.com>
bug was the cause of the 'freeing free frag' panics that people have been
seeing with FreeBSD/alpha. I have a similar patch to newfs but I've not
finished testing it.
support, which need a final "\n". I only observed one line of
mangled output, but I think there is another one which suffers
from the same problem, and thus I provide a patch that covers
both.
PR: 7483
Reviewed by: phk
Submitted by: Stefan Esser <se@FreeBSD.org>
that `fsck -p' doesn't check multiple slices on the same drive
concurrently. Don't invoke undefined behaviour when searching for
the drive number in strange device names.
PR: 6129
Reviewed by: phk
Submitted by: Yuichi MATSUTAKA <matutaka@osa.att.ne.jp>, but rewritten
by me.
superblock is invalid, fsck looks at the label to help guess where
the next superblock should be. If the partition type is 4.2BSD,
fsck assumed that the block size was valid and divided by it, so
it dumped core if the size was 0.
Initialization of the label was broken almost 3 years ago in rev.1.9
of newfs/newfs.c. Newfs does not change the label at all, so there
is no problem (except the breakage of the automatic search for
backup superblocks) unless something else sets the partition type
to 4.2BSD. However, it is too easy to set partition types to
4.2.BSD by copying an old label or by using a disktab entry to
create the label.
PR: 2537
floating point better in the percentage calculation there to avoid
overflow when there are more than about 20 million fragments. Start
using floating point in the other percentage calculation to avoid
overflow when there are more than about 2 million fragments.
Fixed printf format strings.
Converted sccsid to rcsid.
when there isn't even a filesystem. Attempting to print them tended
to cause SIGSEGV or SIGFPE depending on how far setup() got before it
returned 0. This was broken in the previous revision by removing a
return statement that the previous case depended on falling into.
PR: 4840 (fixed by this commit)
PR: 2537 (possibly fixed by Lite2 merge and later changes. setup()
does more checking now)
something closer to how we used to do it. The Lite2 way is to check the
"fsclean" flag in the superblock and stop there if so (during preen).
We now do the various superblock sanity checks that we used to do before
since it's cheap. We now get the filesystem state summary again instead
of "FILESYSTEM CLEAN; CHECKING SKIPPED" (or whatever).
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
for gcc >= 2.5 and no-ops for gcc >= 2.6. Converted to use __dead2
or __pure2 where it wasn't already done, except in math.h where use
of __pure was mostly wrong.
Subject: Fix for annoying fsck bug
Date: Wed, 24 Jan 1996 13:33:29 -0700 (MST)
The following small diff fixes the annoying fsck bug that causes it to
need to be run twice to end up with correct reference counts for inodes
for directories that had subdirectories relocated into the lost+found
directory.
I found the need to rerun *extremely* annoying. This fix causes the
count to be correctly adjusted later in pass 4 by correctly stating
the parent reference count.
Note that the parent reference count is incremented when the directory
entry is made (for ".."), but is not really there in the case of a
directory that does not make an entry in its parent dir.
This can be tested by waiting for the inode sync after cd'ing from a
shell into a test fs. Then you "mkdir xxx yyy zzz", wait a second,
and hit the machine reset button.
Reviewed by: nate (Tested lots of crashes :)
Submitted by: Terry Lambert <terry@lambert.org>
1) dir.c: get byte order right in mkentry()
2) pass1.c: When doing -c2 conversion, do secsize reads for a symlink -
not doing so was causing the conversion to fail because the device
driver can't deal with short reads.
preen (-p), and in that case the filesystem is skipped if it is clean.
A new flag "-f" for 'force' has been added which basically gives back
the old behavior of checking all the filesystems all the time. This
very closely models the behavior of SunOS and Ultrix.