check hash to cylinder groups. If a check hash fails when a cylinder
group is read, no further allocations are attempted in that cylinder
group until it has been fixed by fsck. This avoids a class of
filesystem panics related to corrupted cylinder group maps. The
hash is done using crc32c.
Check hases are added only to UFS2 and not to UFS1 as UFS1 is primarily
used in embedded systems with small memories and low-powered processors
which need as light-weight a filesystem as possible.
Specifics of the changes:
sys/sys/buf.h:
Add BX_FSPRIV to reserve a set of eight b_xflags that may be used
by individual filesystems for their own purpose. Their specific
definitions are found in the header files for each filesystem
that uses them. Also add fields to struct buf as noted below.
sys/kern/vfs_bio.c:
It is only necessary to compute a check hash for a cylinder
group when it is actually read from disk. When calling bread,
you do not know whether the buffer was found in the cache or
read. So a new flag (GB_CKHASH) and a pointer to a function to
perform the hash has been added to breadn_flags to say that the
function should be called to calculate a hash if the data has
been read. The check hash is placed in b_ckhash and the B_CKHASH
flag is set to indicate that a read was done and a check hash
calculated. Though a rather elaborate mechanism, it should
also work for check hashing other metadata in the future. A
kernel internal API change was to change breada into a static
fucntion and add flags and a function pointer to a check-hash
function.
sys/ufs/ffs/fs.h:
Add flags for types of check hashes; stored in a new word in the
superblock. Define corresponding BX_ flags for the different types
of check hashes. Add a check hash word in the cylinder group.
sys/ufs/ffs/ffs_alloc.c:
In ffs_getcg do the dance with breadn_flags to get a check hash and
if one is provided, check it.
sys/ufs/ffs/ffs_vfsops.c:
Copy across the BX_FFSTYPES flags in background writes.
Update the check hash when writing out buffers that need them.
sys/ufs/ffs/ffs_snapshot.c:
Recompute check hash when updating snapshot cylinder groups.
sys/libkern/crc32.c:
lib/libufs/Makefile:
lib/libufs/libufs.h:
lib/libufs/cgroup.c:
Include libkern/crc32.c in libufs and use it to compute check
hashes when updating cylinder groups.
Four utilities are affected:
sbin/newfs/mkfs.c:
Add the check hashes when building the cylinder groups.
sbin/fsck_ffs/fsck.h:
sbin/fsck_ffs/fsutil.c:
Verify and update check hashes when checking and writing cylinder groups.
sbin/fsck_ffs/pass5.c:
Offer to add check hashes to existing filesystems.
Precompute check hashes when rebuilding cylinder group
(although this will be done when it is written in fsutil.c
it is necessary to do it early before comparing with the old
cylinder group)
sbin/dumpfs/dumpfs.c
Print out the new check hash flag(s)
sbin/fsdb/Makefile:
Needs to add libufs now used by pass5.c imported from fsck_ffs.
Reviewed by: kib
Tested by: Peter Holm (pho)
partitioning scheme.
Users often get confused and frustrated when trying to delete partition
table and getting ``Device busy'' error because they forgot (or did not
ever know that they have) to delete all its partitions first, and while
the manual page mentions this briefly, it does not stress it out enough.
Approved by: ae, manpages (bjk)
PR (as inspiration): 196102
Differential Revision: https://reviews.freebsd.org/D12336
Any geom class using g_metadata_store, as well as geom_virstor which
duplicated g_metadata_store internally, would dump sectorsize - mdsize bytes
of userspace memory following the metadata block stored. This is most or all
geom classes (gcache, gconcat, geli, gjournal, glabel, gmirror, gmultipath,
graid3, gshsec, gstripe, and geom_virstor).
PR: 222077 (comment #3)
Reported by: Maxim Khitrov <max AT mxcrypt.com>
Reviewed by: des
Security: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12269
superblocks created in revision 322297 only works on disks
with sector sizes up to 4K. This update allows the recovery
information to be created by newfs and used by fsck on disks
with sector sizes up to 64K. Note that FFS currently limits
filesystem to be mounted from disks with up to 8K sectors.
Expanding this limitation will be the subject of another
commit.
Reported by: Peter Holm
Reviewed with: kib
This feature comes from the fact that we rely memory-backed md(4)
in our build process heavily. However, if the build goes haywire
the allocated resources (i.e. swap and memory-backed md(4)'s) need
to be purged. It is extremely useful to have ability to attach
arbitrary labels to each of the virtual disks so that they can
be identified and GC'ed if neecessary.
MFC after: 4 weeks
Differential Revision: https://reviews.freebsd.org/D10457
Non-tests/... changes:
- Add HAS_TESTS= to Makefiles with libraries and programs to enable iteration
and propagate the appropriate environment down to *.test.mk.
tests/... changes:
- Add appropriate support Makefile.inc's to set HAS_TESTS in a minimal manner,
since tests/... is a special subdirectory tree compared to the others.
MFC after: 2 months
MFC with: r322511
Reviewed by: arch (silence), testing (silence)
Differential Revision: D12014
unable to automatically find alternate superblocks. This checkin
places the information needed to find alternate superblocks to the
end of the area reserved for the boot block.
Filesystems created with a newfs of this vintage or later will
create the recovery information. If you have a filesystem created
prior to this change and wish to have a recovery block created for
your filesystem, you can do so by running fsck in forground mode
(i.e., do not use the -p or -y options). As it starts, fsck will
ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should
answer yes.
Discussed with: kib, imp
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D11589
a mismatch, but will allow fsck to continue when the last alternate
superblock gets corrupted somehow.
Also, remove searching for alternate super blocks. It should have been
removed two years ago with r276737 by imp@. Leave minor vestiges in
place in case someone wants to solve the hard problem of knowing where
altnernate superblocks live without access to data formerly stored in
disklabels.
Differential Revision: https://reviews.freebsd.org/D11589
ifconfig(8) printing the hwaddr is only really useful if it differs from
the link layer address.
Reported by: jhb
Reviewed by: rpokala
Approved by: rstone (mentor)
Differential Revision: https://reviews.freebsd.org/D11777
directories to SUBDIR.${MK_TESTS} idiom
This is being done to pave the way for future work (and homogenity) in
^/projects/make-check-sandbox .
No functional change intended.
MFC after: 1 weeks
The intent is to skip expensive opaque sysctls like tcp_pcblist unless
they are explicitly requested. Sysctl nodes like this don't show up in
sysctl -a, but they do generate output that winds up being dropped,
unless the user specifically requested binary/hex output or opaques.
This reduces the runtime of sysctl in many circumstances on a loaded
system. It also reduces the likelihood that simply gathering
diagnostics on a sick machine (stuck lock, etc) via sysctl -a might
push it over the edge into a total lockup.
Reviewed by: jtl
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D11461
point.
The new "-N" option does a forced dismount of an NFS mount point, but avoids
doing any checking of the mounted-on path, so that it will not get hung
when a vnode lock is held by another hung process on the mounted-on vnode.
The most common case of this is a "umount" with the "-f" option.
Other than avoiding checking the mounted-on path, it performs the same
forced dismount as a successful "umount -f" would do.
This commit includes a content change to the man page.
Tested by: pho
Reviewed by: kib
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D11735
Copy the most important test cases from OpenBSD's corresponding
src/regress/sbin/pfctl, those that run pfctl on a test input file and check
correctness of its output. We have also added some new tests using the same
format.
The tests consist of a collection of input files (pf*.in) and
corresponding output files (pf*.ok). We run pfctl -nv on the input
files and check that the output matches the output files. If any
discrepancy is discovered during future development in the source
tree, we know that a regression bug has been introduced into the tree.
Submitted by: paggas
Sponsored by: Google, Inc (GSoC 2017)
Differential Revision: https://reviews.freebsd.org/D11322
- Delete trailing whitespace.
- Replace 8 single column spaces with hard tabs.
- Delete lines with consisting purely of blank space.
- Add space between `return` and `(`, per style(9).
Special care was taken to not blindly replace 8 single column spaces
with tabs; doing so could break tools that do strict string comparisons
with camcontrol output.
This basically makes "mount -uw /" work when the filesystem
mounted on / is NFS, but the one configured in fstab(5) is UFS,
which can happen when you forget to modify fstab.
Note that the whole special case ("else if (argv[0][0] == '/'")
is probably not needed anyway. I'll take a look at removing it
altogether; for now this is a minimally intrusive fix.
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D11323
so don't imply that. Note that if BIO_DELETE isn't supported, the
operation will fail (as opposed to writing the entire disk with
zeros). Thin storage also benefits from trim. List more accurate
reason why trim helps flash-memory.
After the addition of SUBDIR.yes, uniquifying/ordering the SUBDIRs doesn't
make a whole lot of sense, and it's in effect a half measure.
Ordering SUBDIR (after adding SUBDIR.yes to it) in bsd.subdir.mk is a
separate change that warrants more discussion/testing, because while
the SUBDIR_PARALLEL work largely fixed dependency ordering for SUBDIRs,
there might be downstream FreeBSD consumers that rely on the SUBDIR
ordering.
MFC after: 2 months
Reviewed by: bdrewery
Differential Revision: D11398
After review by the WDC engineers, improve how we pull down the
so-called 'e6' logs. The 'c6' logs are obsolete and support for them
has been removed because FreeBSD needed to pull them in chunks, which
is incompatible with the 0xc6 opcode implementation. Rather than leave
the code in place that produces bad log pulls, remove it.
Since buildenv exports SYSROOT all of these uses will now look in
WORLDTMP by default.
sys/boot/efi/loader/Makefile
A LIBSTAND hack is no longer required for buildenv.
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Its purpose was to translate the values for msdosfs inode numbers,
which is calculated from the msdosfs structures describing the file,
into the range representable by 32bit ino_t. The translation acted
for filesystems larger than 128Gb, it reserved the range 0xf0000000
(FILENO_FIRST_DYN) to UINT32_MAX and remembered some arbitrary
translation of ino >= FILENO_FIRST_DYN into this range. It consumed
memory that could be only freed by unmount, and the translation was
not stable across remounts.
With ino_t type extended to 64 bit, there is no such issue and values
can be returned without compaction to 32bit. That is, for the native
environments, the translation layer is not necessary and adds
significant undeserved code complexity. For compat ABIs which use
32bit ino_t, the vfs.ino64_trunc_error sysctl provides some measures
to soften the failure mode when inode numbers truncation is not safe.
Discussed with: bde
Sponsored by: The FreeBSD Foundation