Commit Graph

118 Commits

Author SHA1 Message Date
Kirk McKusick
75e3597abb Continuing efforts to provide hardening of FFS, this change adds a
check hash to cylinder groups. If a check hash fails when a cylinder
group is read, no further allocations are attempted in that cylinder
group until it has been fixed by fsck. This avoids a class of
filesystem panics related to corrupted cylinder group maps. The
hash is done using crc32c.

Check hases are added only to UFS2 and not to UFS1 as UFS1 is primarily
used in embedded systems with small memories and low-powered processors
which need as light-weight a filesystem as possible.

Specifics of the changes:

sys/sys/buf.h:
    Add BX_FSPRIV to reserve a set of eight b_xflags that may be used
    by individual filesystems for their own purpose. Their specific
    definitions are found in the header files for each filesystem
    that uses them. Also add fields to struct buf as noted below.

sys/kern/vfs_bio.c:
    It is only necessary to compute a check hash for a cylinder
    group when it is actually read from disk. When calling bread,
    you do not know whether the buffer was found in the cache or
    read. So a new flag (GB_CKHASH) and a pointer to a function to
    perform the hash has been added to breadn_flags to say that the
    function should be called to calculate a hash if the data has
    been read. The check hash is placed in b_ckhash and the B_CKHASH
    flag is set to indicate that a read was done and a check hash
    calculated. Though a rather elaborate mechanism, it should
    also work for check hashing other metadata in the future. A
    kernel internal API change was to change breada into a static
    fucntion and add flags and a function pointer to a check-hash
    function.

sys/ufs/ffs/fs.h:
    Add flags for types of check hashes; stored in a new word in the
    superblock. Define corresponding BX_ flags for the different types
    of check hashes. Add a check hash word in the cylinder group.

sys/ufs/ffs/ffs_alloc.c:
    In ffs_getcg do the dance with breadn_flags to get a check hash and
    if one is provided, check it.

sys/ufs/ffs/ffs_vfsops.c:
    Copy across the BX_FFSTYPES flags in background writes.
    Update the check hash when writing out buffers that need them.

sys/ufs/ffs/ffs_snapshot.c:
    Recompute check hash when updating snapshot cylinder groups.

sys/libkern/crc32.c:
lib/libufs/Makefile:
lib/libufs/libufs.h:
lib/libufs/cgroup.c:
    Include libkern/crc32.c in libufs and use it to compute check
    hashes when updating cylinder groups.

Four utilities are affected:

sbin/newfs/mkfs.c:
    Add the check hashes when building the cylinder groups.

sbin/fsck_ffs/fsck.h:
sbin/fsck_ffs/fsutil.c:
    Verify and update check hashes when checking and writing cylinder groups.

sbin/fsck_ffs/pass5.c:
    Offer to add check hashes to existing filesystems.
    Precompute check hashes when rebuilding cylinder group
    (although this will be done when it is written in fsutil.c
    it is necessary to do it early before comparing with the old
    cylinder group)

sbin/dumpfs/dumpfs.c
    Print out the new check hash flag(s)

sbin/fsdb/Makefile:
    Needs to add libufs now used by pass5.c imported from fsck_ffs.

Reviewed by: kib
Tested by: Peter Holm (pho)
2017-09-22 12:45:15 +00:00
Kirk McKusick
855662c611 The new fsck recovery information to enable it to find backup
superblocks created in revision 322297 only works on disks
with sector sizes up to 4K. This update allows the recovery
information to be created by newfs and used by fsck on disks
with sector sizes up to 64K. Note that FFS currently limits
filesystem to be mounted from disks with up to 8K sectors.
Expanding this limitation will be the subject of another
commit.

Reported by: Peter Holm
Reviewed with: kib
2017-09-04 20:19:36 +00:00
Kirk McKusick
77b63aa0fc Since the switch to GPT disk labels, fsck for UFS/FFS has been
unable to automatically find alternate superblocks. This checkin
places the information needed to find alternate superblocks to the
end of the area reserved for the boot block.

Filesystems created with a newfs of this vintage or later will
create the recovery information. If you have a filesystem created
prior to this change and wish to have a recovery block created for
your filesystem, you can do so by running fsck in forground mode
(i.e., do not use the -p or -y options). As it starts, fsck will
ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should
answer yes.

Discussed with: kib, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11589
2017-08-09 05:17:21 +00:00
Warner Losh
fbbd9655e5 Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Ed Maste
1dc349ab95 prefix UFS symbols with UFS_ to reduce namespace pollution
Specifically:
  ROOTINO -> UFS_ROOTINO
  WINO -> UFS_WINO
  NXADDR -> UFS_NXADDR
  NDADDR -> UFS_NDADDR
  NIADDR -> UFS_NIADDR
  MAXSYMLINKLEN_UFS[12] -> UFS[12]_MAXSYMLINKLEN (for consistency)

Also prefix ext2's and nandfs's NDADDR and NIADDR with EXT2_ and NANDFS_

Reviewed by:	kib, mckusick
Obtained from:	NetBSD
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D9536
2017-02-15 19:50:26 +00:00
Marcelo Araujo
0193043c46 Use MIN()/MAX() macros from sys/param.h.
Reviewed by:	trasz
MFC after:	2 weeks.
Differential Revision:	https://reviews.freebsd.org/D6118
2016-05-02 00:45:46 +00:00
Christian Brueffer
295a5bd78c Refer newfs and growfs users to fsck_ffs instead of
fsck, the latter does not accept the referred to "-b" flag.

This change was accidently committed directly to 9-STABLE in
r237505.

PR:		82720
Submitted by:	David D.W. Downey
MFC after:	1 week
2014-02-09 14:28:47 +00:00
Xin LI
9da19cd746 Don't call arc4random_stir() explicitly. To quote arc4random(3)
manual page:

    There is no need to call arc4random_stir() before using
    arc4random() functions family, since they automatically
    initialize themselves.

No objection:	des
MFC after:	2 weeks
2013-10-29 17:34:15 +00:00
Kirk McKusick
baa12a84a7 The purpose of this change to the FFS layout policy is to reduce the
running time for a full fsck. It also reduces the random access time
for large files and speeds the traversal time for directory tree walks.

The key idea is to reserve a small area in each cylinder group
immediately following the inode blocks for the use of metadata,
specifically indirect blocks and directory contents. The new policy
is to preferentially place metadata in the metadata area and
everything else in the blocks that follow the metadata area.

The size of this area can be set when creating a filesystem using
newfs(8) or changed in an existing filesystem using tunefs(8).
Both utilities use the `-k held-for-metadata-blocks' option to
specify the amount of space to be held for metadata blocks in each
cylinder group. By default, newfs(8) sets this area to half of
minfree (typically 4% of the data area).

This work was inspired by a paper presented at Usenix's FAST '13:
www.usenix.org/conference/fast13/ffsck-fast-file-system-checker

Details of this implementation appears in the April 2013 of ;login:
www.usenix.org/publications/login/april-2013-volume-38-number-2.
A copy of the April 2013 ;login: paper can also be downloaded
from: www.mckusick.com/publications/faster_fsck.pdf.

Reviewed by: kib
Tested by:   Peter Holm
MFC after:   4 weeks
2013-03-22 21:45:28 +00:00
Edward Tomasz Napierala
549f62fa42 Fix problem with geom_label(4) not recognizing UFS labels on filesystems
extended using growfs(8).  The problem here is that geom_label checks if
the filesystem size recorded in UFS superblock is equal to the provider
(i.e. device) size.  This check cannot be removed due to backward
compatibility.  On the other hand, in most cases growfs(8) cannot set
fs_size in the superblock to match the provider size, because, differently
from newfs(8), it cannot recompute cylinder group sizes.

To fix this problem, add another superblock field, fs_providersize, used
only for this purpose.  The geom_label(4) will attach if either fs_size
(filesystem created with newfs(8)) or fs_providersize (filesystem expanded
using growfs(8)) matches the device size.

PR:		kern/165962
Reviewed by:	mckusick
Sponsored by:	FreeBSD Foundation
2012-10-30 21:32:10 +00:00
Matthew D Fleming
e25a029eb2 Fix sbin/ build with a 64-bit ino_t.
Original code by:	Gleb Kurtsou
2012-09-27 23:31:06 +00:00
Eitan Adler
08084125ee Fix warning when compiling with gcc46:
error: variable 'c' set but not used

Approved by:	dim
MFC after:	3 days
2012-01-10 02:59:09 +00:00
Ed Schouten
1efe3c6b58 Add missing static keywords for global variables to tools in sbin/.
These tools declare global variables without using the static keyword,
even though their use is limited to a single C-file, or without placing
an extern declaration of them in the proper header file.
2011-11-04 13:36:02 +00:00
Colin Percival
c2805605f7 Stop trying to zero UFS1 superblocks if we fall off the end of the disk.
This avoids a potentially many-hours-long loop of failed writes if newfs
finds a partially-overwritten superblock (or, for that matter, random
garbage which happens to have superblock magic bytes); on one occasion I
found newfs trying to zero 800 million superblocks on a 50 MB disk.

Reviewed by:	mckusick
MFC after:	1 week
2011-04-26 02:06:31 +00:00
Kirk McKusick
7649cb0043 The dump, fsck_ffs, fsdb, fsirand, newfs, makefs, and quot utilities
include sys/time.h instead of time.h. This include is incorrect as
per the manpages for the APIs and the POSIX definitions. This commit
replaces sys/time.h where necessary with time.h.

The commit also includes some minor style(9) header fixup in newfs.

This commit is part of a larger effort by Garrett Cooper started in
//depot/user/gcooper/posix-conformance-work/ -- to make FreeBSD more
POSIX compliant.

Submitted by:  Garrett Cooper   yanegomi at gmail dot com
2011-01-24 06:17:05 +00:00
Konstantin Belousov
a738d4cf20 Add support for FS_TRIM to user-mode UFS utilities.
Reviewed by:	mckusick, pjd, pho
Tested by:	pho
MFC after:	1 month
2010-12-29 12:31:18 +00:00
Kirk McKusick
8d408dff91 Reported problem:
Large (60GB) filesystems created using "newfs -U -O 1 -b 65536 -f 8192"
show incorrect results from "df" for free and used space when mounted
immediately after creation. fsck on the new filesystem (before ever
mounting it once) gives a "SUMMARY INFORMATION BAD" error in phase 5.

This error hasn't occurred in any runs of fsck immediately after
"newfs -U -b 65536 -f 8192" (leaving out the "-O 1" option).

Solution:
The default UFS1 superblock is located at offset 8K in the filesystem
partition; the default UFS2 superblock is located at offset 64K in
the filesystem partition. For UFS1 filesystems with a blocksize of
64K, the first alternate superblock resides at 64K which is the the
location used for the default UFS2 superblock. By default, the
system first checks for a valid superblock at the default location
for a UFS2 filoesystem. For a UFS1 filesystem with a blocksize of
64K, there is a valid UFS1 superblock at this location.  Thus, even
though it is expected to be a backup superblock, the system will
use it as its default superblock. So, we have to ensure that all the
statistcs on usage are correct in this first alternate superblock
as it is the superblock that will actually be used.

While tracking down this problem, another limitation of UFS1 became
evident. For UFS1, the number of inodes per cylinder group is stored
in an int16_t. Thus the maximum number of inodes per cylinder group
is limited to 2^15 - 1. This limit can easily be exceeded for block
sizes of 32K and above. Thus when building UFS1 filesystems, newfs
must limit the number of inodes per cylinder group to 2^15 - 1.

Reported by: Guy Helmer<ghelmer@palisadesys.com>
Followup by: Bruce Cran <brucec@freebsd.org>
PR:          107692
MFC after:   4 weeks
2010-09-24 19:08:56 +00:00
Maxim Sobolev
e0999e592b o bdeficize expand_number_int() function;
o revert most of the recent changes (int -> int64_t conversion) by using
this functon for parsing all options.
2010-03-09 19:31:08 +00:00
Warner Losh
683d4eac76 Cast these to intmax_t before printing to fix build bustage. Better
solutions welcome.
2010-03-03 21:53:25 +00:00
Kirk McKusick
81479e688b One last pass to get all the unsigned comparisons correct. 2010-02-11 18:14:53 +00:00
Kirk McKusick
cb464c69c0 Ensure that newfs will never create a filesystem with more than 2^32
inodes by cutting back on the number of inodes per cylinder group if
necessary to stay under the limit. For a default (16K block) file
system, this limit begins to take effect for file systems above 32Tb.

This fix is in addition to -r203763 which corrected a problem in the
kernel that treated large inode numbers as negative rather than unsigned.
For a default (16K block) file system, this bug began to show up at a
file system size above about 16Tb.

Reported by: Scott Burns, John Kilburg, Bruce Evans
Followup by: Jeff Roberson
PR:          133980
MFC after:   2 weeks
2010-02-10 20:17:46 +00:00
Martin Blapp
1457e0cdac Fix typo: s/partion/partition/
Submitted by:	Marc Balmer <marc@msys.ch>
MFC after:	3 days
2010-01-02 17:32:40 +00:00
Olivier Houchard
02dda28606 Don't add a bwrite() symbol, it breaks the build when building newfs
statically.
Instead, bring in a stripped down version of sbwrite(), and add the offset
to every bwrite() calls.
2009-02-12 15:28:15 +00:00
Luigi Rizzo
64c8fef580 Enable operation of newfs on plain files, which is useful when you
want to prepare disk images for emulators (though 'makefs' in port
can do something similar).

This relies on:
+ minor changes to pass the consistency checks even when working on a file;

+ an additional option, '-p partition' , to specify the disk partition to
  initialize;

+ some changes on the I/O routines to deal with partition offsets.

The latter was a bit tricky to implement, see the details in newfs.h:
in newfs, I/O is done through libufs which assumes that the file
descriptor refers to the whole partition. Introducing support for
the offset in libufs would require a non-backward compatible change
in the library, to be dealt with a version bump or with symbol
versioning.

I felt both approaches to be overkill for this specific application,
especially because there might be other changes to libufs that might
become necessary in the near future.

So I used the following trick:
- read access is always done by calling bread() directly, so we just add
  the offset in the (few) places that call bread();
- write access is done through bwrite() and sbwrite(), which in turn
  calls bwrite(). To avoid rewriting sbwrite(), we supply our own version
  of bwrite() here, which takes precedence over the version in libufs.

MFC after:	4 weeks
2008-12-03 18:36:59 +00:00
Xin LI
a6a568708b Use calloc(). 2008-03-05 23:17:19 +00:00
Poul-Henning Kamp
59c0f72857 Report erase interval (correctly) in sectors. 2007-12-16 20:19:55 +00:00
Poul-Henning Kamp
9a6378d803 Rename the undocumented -E option to -X.
Implement -E option which will erase the filesystem sectors before
making the new filesystem.  Reserved space in front of the superblock
(bootcode) is not erased.

NB: Erasing can take as long time as writing every sector sequentially.

This is relevant for all flash based disks which use wearlevelling.
2007-12-16 19:41:31 +00:00
Pawel Jakub Dawidek
868c68ed1d Add -J flag to both newfs(8) and tunefs(8) which allows to enable gjournal
support.
I left -j flag for UFS journal implementation which we may gain at some
point.

Sponsored by:	home.pl
2006-10-31 21:52:28 +00:00
Xin LI
3a6ab3de8d Explicitly say which gid do we use as a fallback, when operator
is not found.

Suggested by:	kensmith
2006-09-27 05:49:21 +00:00
Ian Dowse
9405aea2e2 Don't treat failure to find the operator GID as a fatal error; this
made it impossible to use newfs (and mdmfs) when /etc/group is
missing and /etc is read-only.
2005-08-14 17:07:04 +00:00
Xin LI
3ae329b8d2 When creating a new FFS file system, the block size will indirectly
affect the largest file size that is allowed by the file system.
On the other hand, when creating a snapshot, the snapshot file will
appear as it is as big as the file system itself.  Hence we will not
be able to create a file system on large file systems with small
block sizes.

Add a warning about this, and gives some hints to correct the issue.

Reviewed by:	mckusick
MFC After:	1 week
2005-02-20 06:33:18 +00:00
Wes Peters
34b59b6bf2 Add an option to suppress the creation of the .snap directory in
the new filesystem.  This is intended for memory and vnode filesystems
that will never be fsck'ed or dumped.

Obtained from:	St. Bernard Software RAPID
MFC after:	2 weeks
2005-01-21 22:20:25 +00:00
John Baldwin
b72ea57f3b Generalize the UFS bad magic value used to determine when a filesystem
has only been partly initialized via newfs(8) so that it applies to both
UFS1 and UFS2.

Submitted by:	"Xin LI" delphij at frontfree dot net
MFC:		maybe?
2004-08-19 11:09:13 +00:00
Mark Murray
4c723140a4 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core, imp
2004-04-09 19:58:40 +00:00
Robert Watson
ce20d788fa Add a "-l" flag to newfs, which sets the FS_MULTILABEL flag. This
permits users of newfs to set the multilabel flag on UFS1 and UFS2
file systems from inception without using tunefs.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, McAfee Research
2004-02-26 01:14:27 +00:00
Wes Peters
96982f9bfd Fix whitespace error in previous commit.
Approved by:	RE@ (Robert Watson)
2003-11-27 01:19:23 +00:00
Wes Peters
f44ec7f89e Don't use UFS2_BAD_MAGIC on UFS (v1) filesystems; it is Not Ready
for Prime Time there.

Submitted by:	Xin LI <delphij@frontfree.net>
Approved by:	RE@ (John, Scott)
2003-11-23 08:29:01 +00:00
Wes Peters
0af4e34b2e Add the -E command line option to force error conditions for testing.
Sponsord by:	St. Bernard Software
2003-11-16 07:17:30 +00:00
Wes Peters
ec52df8eb9 Write the UFS2 superblock with a 'BAD' magic number at the beginning
of newfs, to signify the newfs operation has not yet completed.  Re-
write the superblock with the correct magic number once all of the
cylinder groups have been created to show the operation has finished.

Sponsored by:	St. Bernard Software
2003-11-16 07:08:27 +00:00
Kirk McKusick
524ee1107f Create a .snap directory mode 770 group operator in the root of
a new filesystem. Dump and fsck will create snapshots in this
directory rather than in the root for two reasons:

1) For terabyte-sized filesystems, the snapshot may require many
   minutes to build. Although the filesystem will not be suspended
   during most of the snapshot build, the snapshot file itself is
   locked during the entire snapshot build period. Thus, if it is
   accessed during the period that it is being built, the process
   trying to access it will block holding its containing directory
   locked. If the snapshot is in the root, the root will lock and
   the system will come to a halt until the snapshot finishes. By
   putting the snapshot in a subdirectory, it is out of the likely
   path of any process traversing through the root and hence much
   less likely to cause a lock race to the root.

2) The dump program is usually run by a non-root user running with
   operator group privilege. Such a user is typically not permitted
   to create files in the root of a filesystem. By having a directory
   in group operator with group write access available, such a user
   will be able to create a snapshot there. Having the dump program
   create its snapshot in a subdirectory below the root will benefit
   from point (1) as well.

Sponsored by:   DARPA & NAI Labs.
2003-11-04 07:34:32 +00:00
Yaroslav Tykhiy
244fca1ffa Exit with a non-zero status upon a block allocation failure.
The old way of just returning could result in a file system
extremely likely to panic the kernel.  The warning printed
wouldn't help much since tools invoking newfs(8), e.g., mdmfs(8),
couldn't detect the error.

PR:		bin/55078
MFC after:	1 week
2003-08-05 13:35:17 +00:00
Doug Barton
a32bb1b53a When newfs'ing a partition with UFS2 that had previously been newfs'ed
with UFS1, the UFS1 superblocks were not deleted. This allowed any
RELENG_4 (or other non-UFS2-aware) fsck to think it knew how to "fix"
the file system, resulting in severe data scrambling.

This patch is a more advanced version than the one originally submitted.
Lukas improved it based on feedback from Kirk, and testing by me. It
blanks all UFS1 superblocks (if any) during a UFS2 newfs, thereby causing
fsck's that are not UFS2 aware to generate the "SEARCH FOR ALTERNATE
SUPER-BLOCK FAILED" message, and exit without damaging the fs.

PR:		bin/51619
Submitted by:	Lukas Ertl <l.ertl@univie.ac.at>
Reviewed by:	kirk
Approved by:	re (scottl)
2003-05-22 18:38:54 +00:00
Ian Dowse
7bdf1805b1 Put back the error checking in wtfs() that was lost when newfs was
changed to use libufs in revision 1.71. Without this, any write
failures in newfs were silently ignored.

Note that this will display a meaningless errno string in the case
of a short write as opposed to a write error, since bwrite()'s
return value does not allow the caller to determine if errno is
valid.

Reported by:	Lukas Ertl <l.ertl@univie.ac.at>
Reviewed by:	jmallett
Approved by:	re (bmah)
2003-05-10 18:58:17 +00:00
David E. O'Brien
c69284ca08 Use __FBSDID() to quiet GCC 3.3 warnings. 2003-05-03 18:41:59 +00:00
Kirk McKusick
e27c9f46ad Fix the -R flag so that it provides sequential "random" numbers
so that the regression test will succeed.

Sponsored by:   DARPA & NAI Labs.
2003-02-22 23:26:11 +00:00
Kirk McKusick
aca3e4974f Replace use of random() with arc4random() to provide less guessable
values for the initial inode generation numbers in newfs and for
newly allocated inode generation numbers in the kernel.

Submitted by:	Theo de Raadt <deraadt@cvs.openbsd.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-14 21:31:58 +00:00
Kirk McKusick
363c185255 Correct lines incorrectly added to the copyright message. Add missing period.
Submitted by:	Bruce Evans <bde@zeta.org.au>
Sponsored by:   DARPA & NAI Labs.
2003-02-14 21:08:14 +00:00
Juli Mallett
fc903aa525 Convert newfs to libufs (really). Solves one real issue with previous
version of such.  Differences in filesystems generated were found to be
from 1) sbwrite with the "all" parameter 2) removal of writecache.  The
sbwrite call was made to perform as the original version, and otherwise
this was checked against a version of newfs with the write cache removed.
2003-02-11 03:06:45 +00:00
Gordon Tetlow
c715b047bc Bring in support for volume labels to the filesystem utilities.
Reviewed by:	mckusick
2003-02-01 04:17:10 +00:00
Juli Mallett
5a29754e3f Back out conversion to libufs, for now. It seems to cause problems.
Reported by: phk
2003-01-29 22:52:27 +00:00