2171 Commits

Author SHA1 Message Date
mckusick
de7c6145de A race condition existed between the time a UFS/FFS superblock check
hash was computed and the time that the superblock was copied to a
buffer to be written to disk. The result was a failed superblock
check hash the next time that the superblock was read.

The fix is to compute the check hash after the superblock has been
copied to a buffer to be written.

PR:           236504
Reported by:  Peter Holm
Tested by:    Peter Holm
Sponsored by: Netflix
2019-08-06 18:10:34 +00:00
mckusick
7f0988ede6 When updating the user or group disk quotas for the return of inodes or
disk blocks, set the FORCE flag in the call to chkiq() or chkdq() since
the user is always allowed to return resources and hence there is no need
to check the user's credential .

Reported by:    Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert of Fraunhofer FKIE
Reported as:    FS-1-UFS-1: Denial Of Service in mount (prison_priv_check)
Discussed with: kib
MFC:            1 week
Sponsored by:   Netflix
2019-07-31 22:44:58 +00:00
rmacklem
577fca0e20 Lock the vnode before calling ufs_bmap_seekdata().
r346932 replaced a call to vn_bmap_seekhole() with a call to
ufs_bmap_seekdata(). Although vn_bmap_seekhole() locks the vnode,
ufs_bmap_seekdata() assumes it is already locked.
This patch adds locking of the vnode before the ufs_bmap_seekdata() call.
If the vn_lock() call fails, it returns EBADF since that is the normal
error returned when a file system is forced dismounted and is already
listed as an error return in the lseek(2) man page.

Discussed with:	markj
Reviewed by:	kib
2019-07-27 01:52:34 +00:00
mckusick
eb5503c168 The error reported in FS-14-UFS-3 can only happen on UFS/FFS
filesystems that have block pointers that are out-of-range for their
filesystem. These out-of-range block pointers are corrected by
fsck(8) so are only encountered when an unchecked filesystem is
mounted.

A new "untrusted" flag has been added to the generic mount interface
that can be set when mounting media of unknown provenance or integrity.
For example, a daemon that automounts a filesystem on a flash drive
when it is plugged into a system.

This commit adds a test to UFS/FFS that validates all block numbers
before using them. Because checking for out-of-range blocks adds
unnecessary overhead to normal operation, the tests are only done
when the filesystem is mounted as an "untrusted" filesystem.

Reported by:  Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert of Fraunhofer FKIE
Reported as:  FS-14-UFS-3: Out of bounds read in write-2 (ffs_alloccg)
Reviewed by:  kib
Sponsored by: Netflix
2019-07-17 22:07:43 +00:00
mckusick
895b89d33b Style.
No change intended.
2019-07-16 23:39:39 +00:00
mckusick
654c8c5036 When a process attempts to allocate space on a full filesystem, a
filesystem full message is sent to the offending process or the
kernel log if the offending process cannot be identified.

To prevent an explotion of messages, the kernel ppsratecheck()
function is used to limit the messages to one per second. This
revision changes the variable that tracks the rate of these messages
from a systemwide limit to a per-filesystem limit by moving it from
a global variable to a variable in the ufsmount structure.

Suggested by: kib
Reviewed by:  kib
Sponsored by: Netflix
2019-07-16 23:12:27 +00:00
mckusick
0a19651018 Add a new "untrusted" option to the mount command. Its purpose
is to notify the kernel that the file system is untrusted and it
should use more extensive checks on the file-system's metadata
before using it. This option is intended to be used when mounting
file systems from untrusted media such as USB memory sticks or other
externally-provided media.

It will initially be used by the UFS/FFS file system, but should
likely be expanded to be used by other file systems that may appear
on external media like msdosfs, exfat, and ext2fs.

Reviewed by:  kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D20786
2019-07-01 23:22:26 +00:00
markj
944c67f6c1 Remove references to splbio in ffs_softdep.c.
Assert that the per-mountpoint softdep mutex is held in modified
functions that do not already have this assertion.  No functional
change intended.

Reviewed by:	kib, mckusick (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20741
2019-06-26 16:28:42 +00:00
asomers
2672f1ee16 Add FIOBMAP2 ioctl
This ioctl exposes VOP_BMAP information to userland. It can be used by
programs like fragmentation analyzers and optimized cp implementations. But
I'm using it to test fusefs's VOP_BMAP implementation. The "2" in the name
distinguishes it from the similar but incompatible FIBMAP ioctls in NetBSD
and Linux.  FIOBMAP2 differs from FIBMAP in that it uses a 64-bit block
number instead of 32-bit, and it also returns runp and runb.

Reviewed by:	mckusick
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20705
2019-06-20 14:13:10 +00:00
delphij
8581c5bfb9 Separate kernel crc32() implementation to its own header (gsb_crc32.h) and
rename the source to gsb_crc32.c.

This is a prerequisite of unifying kernel zlib instances.

PR:		229763
Submitted by:	Yoshihiro Ota <ota at j.email.ne.jp>
Differential Revision:	https://reviews.freebsd.org/D20193
2019-06-17 19:49:08 +00:00
mckusick
0d342021e5 Add a missing bresle() in seldom-used error return. 2019-05-28 17:31:35 +00:00
mckusick
447eec25c7 Convert use of UFS-specific #ifdef DEBUG to DIAGNOSTIC or INVARIANTS
as appropriate. No functional change intended.

Suggested-by: markj
2019-05-28 16:32:04 +00:00
mckusick
0776148e01 Add function name and line number debugging information to softupdates
worklist structures to help track their movement between work lists.
No functional change to the operation of soft updates intended.
2019-05-27 06:22:43 +00:00
asomers
6798617f6d Remove "struct ucred*" argument from vtruncbuf
vtruncbuf takes a "struct ucred*" argument. AFAICT, it's been unused ever
since that function was first added in r34611. Remove it.  Also, remove some
"struct ucred" arguments from fuse and nfs functions that were only used by
vtruncbuf.

Reviewed by:	cem
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20377
2019-05-24 20:27:50 +00:00
cem
3038f1af7b Include ktr.h in more compilation units
Similar to r348026, exhaustive search for uses of CTRn() and cross reference
ktr.h includes.  Where it was obvious that an OS compat header of some kind
included ktr.h indirectly, .c files were left alone.  Some of these files
clearly got ktr.h via header pollution in some scenarios, or tinderbox would
not be passing prior to this revision, but go ahead and explicitly include it
in files using it anyway.

Like r348026, these CUs did not show up in tinderbox as missing the include.

Reported by:	peterj (arm64/mp_machdep.c)
X-MFC-With:	r347984
Sponsored by:	Dell EMC Isilon
2019-05-21 20:38:48 +00:00
markj
e346003b41 Ensure that error is initialized in ufs_bmap_seekdata().
Reported and tested by:	jhibbits
MFC with:	r346932
Sponsored by:	The FreeBSD Foundation
2019-05-05 16:57:03 +00:00
kib
2dc0d9edaa Switch to use shared vnode locks for text files during image activation.
kern_execve() locks text vnode exclusive to be able to set and clear
VV_TEXT flag. VV_TEXT is mutually exclusive with the v_writecount > 0
condition.

The change removes VV_TEXT, replacing it with the condition
v_writecount <= -1, and puts v_writecount under the vnode interlock.
Each text reference decrements v_writecount.  To clear the text
reference when the segment is unmapped, it is recorded in the
vm_map_entry backed by the text file as MAP_ENTRY_VN_TEXT flag, and
v_writecount is incremented on the map entry removal

The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that
v_writecount does not contradict the desired change.  vn_writecheck()
is now racy and its use was eliminated everywhere except access.
Atomic check for writeability and increment of v_writecount is
performed by the VOP.  vn_truncate() now increments v_writecount
around VOP_SETATTR() call, lack of which is arguably a bug on its own.

nullfs bypasses v_writecount to the lower vnode always, so nullfs
vnode has its own v_writecount correct, and lower vnode gets all
references, since object->handle is always lower vnode.

On the text vnode' vm object dealloc, the v_writecount value is reset
to zero, and deadfs vop_unset_text short-circuit the operation.
Reclamation of lowervp always reclaims all nullfs vnodes referencing
lowervp first, so no stray references are left.

Reviewed by:	markj, trasz
Tested by:	mjg, pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D19923
2019-05-05 11:20:43 +00:00
mckusick
54d65ae934 Zero out the file directory entry metadata to reduce disk
scavenging disclosure.

Submitted by: David G. Lawrence <dg@dglawrence.com>
MFC after:    1 week
2019-05-04 18:00:57 +00:00
mckusick
c4824e67f5 This update eliminates a kernel stack disclosure bug in UFS/FFS
directory entries that is caused by uninitialized directory entry
padding written to the disk. It can be viewed by any user with read
access to that directory. Up to 3 bytes of kernel stack are disclosed
per file entry, depending on the the amount of padding the kernel
needs to pad out the entry to a 32 bit boundry. The offset in the
kernel stack that is disclosed is a function of the filename size.
Furthermore, if the user can create files in a directory, this 3
byte window can be expanded 3 bytes at a time to a 254 byte window
with 75% of the data in that window exposed. The additional exposure
is done by removing the entry, creating a new entry with a 4-byte
longer name, extracting 3 more bytes by reading the directory, and
repeating until a 252 byte name is created.

This exploit works in part because the area of the kernel stack
that is being disclosed is in an area that typically doesn't change
that often (perhaps a few times a second on a lightly loaded system),
and these file creates and unlinks themselves don't overwrite the
area of kernel stack being disclosed.

It appears that this bug originated with the creation of the Fast
File System in 4.1b-BSD (Circa 1982, more than 36 years ago!), and
is likely present in every Unix or Unix-like system that uses
UFS/FFS. Amazingly, nobody noticed until now.

This update also adds the -z flag to fsck_ffs to have it scrub
the leaked information in the name padding of existing directories.
It only needs to be run once on each UFS/FFS filesystem after a
patched kernel is installed and running.

Submitted by: David G. Lawrence <dg@dglawrence.com>
Reviewed by:  kib
MFC after:    1 week
2019-05-03 21:54:14 +00:00
mckusick
e28330c77b Simplify calculation of DIRECTSIZ. No functional change intended.
Suggested by: kib
MFC after:    1 week
2019-05-03 21:46:25 +00:00
markj
a7a455c299 Optimize lseek(SEEK_DATA) on UFS.
This version fixes the problems identified in r345244.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19598
2019-04-29 22:05:26 +00:00
kib
aeecbe07e1 Handle races when remounting UFS volume from ro to rw.
In particular, ensure that writers are not unleashed before SU
structures are initialized.  Also, correctly handle MNT_ASYNC before
this.

Reported and tested by:	pho
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-04-08 15:20:05 +00:00
oshogbo
20d273b44b Introduce funlinkat syscall that always us to check if we are removing
the file associated with the given file descriptor.

Reviewed by:	kib, asomers
Reviewed by:	cem, jilles, brooks (they reviewed previous version)
Discussed with:	pjd, and many others
Differential Revision:	https://reviews.freebsd.org/D14567
2019-04-06 09:34:26 +00:00
mckusick
1e2cc9b200 This is an additional and hopefully final fix for bug report 230962.
This bug was introduced with the change to use softdep_bp_to_mp()
in January 2018 changes -r327723 and -r327821. The softdep_bp_to_mp()
function failed to include VSOCK as one of the valid cases.

Although local-domain sockets do not allocate blocks in the filesystem,
they will allocate blocks if they use extended attributes (such as
ACLs). Thus, softdep_bp_to_mp() needs to return a non-NULL mount
pointer when presented with a socket vnode so that the soft updates
write complete will properly process the soft updates structures
associated with the extended attribute blocks. It was the failure
to process these soft updates structures, thus leaving them hanging
off the buffer, which lead to the "panic: softdep_deallocate_dependencies:
dangling deps" when trying to clean up the buffer after it was written.

PR:           230962
Reported by:  2t8mr7kx9f@protonmail.com
Reviewed by:  kib
Tested by:    Peter Holm
MFC after:    1 week
Sponsored by: Netflix
2019-03-20 23:11:05 +00:00
markj
b24a98cb7e Revert r345244 for now.
The code which advances the block number is simplistic and is not
correct when the starting offset is non-zero.  Revert the change until
this is fixed.
2019-03-18 05:03:55 +00:00
markj
72039f2089 Fix the gcc build (-Wstrict-prototypes) after r345244.
Reported by:	jenkins
MFC with:	r345244
2019-03-17 18:06:13 +00:00
markj
3139ccc80b Optimize lseek(SEEK_DATA) on UFS.
The old implementation, at the VFS layer, would map the entire range of
logical blocks between the starting offset and the first data block
following that offset.  With large sparse files this is very
inefficient.  The VFS currently doesn't provide an interface to improve
upon the current implementation in a generic way.

Add ufs_bmap_seekdata(), which uses the obvious algorithm of scanning
indirect blocks to look for data blocks.  Use it instead of
vn_bmap_seekhole() to implement SEEK_DATA.

Reviewed by:	kib, mckusick
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19598
2019-03-17 17:34:06 +00:00
mckusick
12a82b6654 Add KASSERT to the softdep_disk_write_complete() function in the
soft dependency code to ensure that it will be able to avoid a
dangling dependency.

Sponsored by: Netflix
2019-03-12 00:10:31 +00:00
mckusick
382e4a0d84 Give more complete information in INVARIANTS panic messages at end of
the ffs_truncate() function.

Sponsored by: Netflix
2019-03-11 23:53:56 +00:00
mckusick
a60a8e90c2 Augment the UFS filesystem specific print function (called by the
kernel vn_printf() routine when printing out vnodes associated with
a UFS filesystem) to also include the inode's link count, effective
link count, generation number, owner, group, flags, size, and for
UFS2 filesystems, the extent size.

Sponsored by: Netflix
2019-03-11 22:05:34 +00:00
sjg
3e15cbcd4d Add _PC_ACL_* to vop_stdpathconf
This avoid EINVAL from tmpfs etc.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D19512
2019-03-11 20:40:56 +00:00
jah
6bf36155f2 FFS: allow sendfile(2) to work with block sizes greater than the page size
Implement ffs_getpages_async(), which when possible calls the asynchronous
flavor of the generic pager's getpages function. When the underlying
block size is larger than the system page size, however, it will invoke
the (synchronous) buffer cache pager, followed by a call to the client
completion routine. This retains true asynchronous completion in the most
common (block size <= page size) case, which is important for the performance
of the new sendfile(2). The behavior in the larger block size case mirrors
the default implementation of VOP_GETPAGES_ASYNC, which most other
filesystems use anyway as they do not override the getpages_async method.

PR:		235708
Reported by:	pho
Reviewed by:	kib, glebius
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19340
2019-02-26 04:56:10 +00:00
mckusick
7b71e80a0f After a crash, a file that extends into indirect blocks may end up
shorter than its size resulting in a hole as its final block (which
is a violation of the invarients of the UFS filesystem).

Soft updates will always ensure that the file size is correct when
writing inodes to disk for files that contain only direct block
pointers. However soft updates does not roll back sizes for files
with indirect blocks that it has set to unallocated because their
contents have not yet been written to disk. Hence, the file can
appear to have a hole at its end because the block pointer has been
rolled back to zero when its inode was written to disk. Thus,
fsck_ffs calculates the last allocated block in the file. For files
that extend into indirect blocks, fsck_ffs checks for a size past
the last allocated block of the file and if that is found, shortens
the file to reference the last allocated block thus avoiding having
it reference a hole at its end.

Submitted by: Chuck Silvers <chs@netflix.com>
Tested by:    Chuck Silvers <chs@netflix.com>
MFC after:    1 week
Sponsored by: Netflix
2019-02-25 21:58:19 +00:00
mckusick
0c506150b7 This bug was introduced with the change to use softdep_bp_to_mp() in
January 2018 changes -r327723 and -r327821. The softdep_bp_to_mp()
function failed to include VFIFO as one of the valid cases.

Although fifo's do not allocate blocks in the filesystem, they will
allocate blocks if they use extended attributes (such as ACLs). Thus,
softdep_bp_to_mp() needs to return a non-NULL mount pointer when
presented with a fifo vnode so that the soft updates write complete
will properly process the soft updates structures associated with the
extended attribute blocks. It was the failure to process these soft
updates structures, thus leaving them hanging off the buffer, which
lead to the "panic: softdep_deallocate_dependencies: dangling deps"
when trying to clean up the buffer after it was written.

PR:           230962
Reported by:  2t8mr7kx9f@protonmail.com
Reviewed by:  kib
Tested by:    Peter Holm
MFC after:    1 week
Sponsored by: Netflix
2019-01-28 21:36:45 +00:00
mckusick
23046c2dc3 Expand DDB's set of printable soft dependency data structures. The
set of known soft dependency data structures now includes: sd_worklist,
sd_inodedep, sd_allocdirect, sd_allocindir, and sd_mkdir. DDB can
also print lists of sd_allinodedeps, sd_mkdir_list, and sd_workhead.
The sd_workhead script is useful for listing all the dependencies
associated with a buffer, e.g. bp->b_dep.

Prefix the soft dependency show names with sd_ so that they sort
together when listed by DDB's "show help" and to distinguish them
from other data structures printable by DDB.

Sponsored by: Netflix
2019-01-26 05:35:24 +00:00
glebius
7ee1aa34d4 Allocate pager bufs from UMA instead of 80-ish mutex protected linked list.
o In vm_pager_bufferinit() create pbuf_zone and start accounting on how many
  pbufs are we going to have set.
  In various subsystems that are going to utilize pbufs create private zones
  via call to pbuf_zsecond_create(). The latter calls uma_zsecond_create(),
  and sets a limit on created zone. After startup preallocate pbufs according
  to requirements of all pbuf zones.

  Subsystems that used to have a private limit with old allocator now have
  private pbuf zones: md(4), fusefs, NFS client, smbfs, VFS cluster, FFS,
  swap, vnode pager.

  The following subsystems use shared pbuf zone: cam(4), nvme(4), physio(9),
  aio(4). They should have their private limits, but changing that is out of
  scope of this commit.

o Fetch tunable value of kern.nswbuf from init_param2() and while here move
  NSWBUF_MIN to opt_param.h and eliminate opt_swap.h, that was holding only
  this option.
  Default values aren't touched by this commit, but they probably should be
  reviewed wrt to modern hardware.

This change removes a tight bottleneck from sendfile(2) operation, that
uses pbufs in vnode pager. Other pagers also would benefit from faster
allocation.

Together with:	gallatin
Tested by:	pho
2019-01-15 01:02:16 +00:00
mckusick
e73f4235a2 Move ASSERT_VOP_LOCKED to top of ufs_vinit() as it should be true
when the function is entered.

Suggested by: kib
2018-12-30 06:03:20 +00:00
mckusick
63f92e6bd1 For consistency with FFS2's fifoops2 and both versions of FFS's
vnodeops make FFS1's fifoops1 use ffs_lock. Also delete ffs_reallocblks
from fifoops1 which is needed only for fifoops2 because of its
support for extended attributes that need to allocate blocks.

Suggested by: kib
2018-12-30 05:03:41 +00:00
mckusick
51739d277b When loading an inode from disk, verify that its mode is valid.
If invalid, return EINVAL. Note that inode check-hashes greatly
reduce the chance that these errors will go undetected.

Reported by:  Christopher Krah <krah@protonmail.com>
Reported as:  FS-5-UFS-2: Denial Of Service in nmount-3 (ffs_read)
Reviewed by:  kib
MFC after:    1 week
Sponsored by: Netflix

M    sys/fs/ext2fs/ext2_vnops.c
M    sys/kern/vfs_subr.c
M    sys/ufs/ffs/ffs_snapshot.c
M    sys/ufs/ufs/ufs_vnops.c
2018-12-27 07:18:53 +00:00
kib
c93608ca18 Allocate v_object for the new snapshot vnode.
The vnode is not opened, so it ends up with the malloced buffers otherwise.

Reported and tested by:	pho
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-12-23 18:54:09 +00:00
mckusick
bfbf739aad Ensure that the inode check-hash is not left zeroed out in the case where
the check-hash fails. Prior to the fix in -r342133 the inode with the
zeroed out check-hash was written back to disk causing further confusion.

Reported by:  Gary Jennejohn (gj)
Sponsored by: Netflix
2018-12-15 18:49:30 +00:00
mckusick
b27d4425e8 Reorder ffs_verify_dinode_ckhash() so that it checks the inode check-hash
before copying in the inode so that the mode and link-count are not set
if the check-hash fails. This change ensures that the vnode will be properly
unwound and recycled rather than being held in the cache.

Initialize the file mode is zero so that if the loading of the inode
fails (for example because of a check-hash failure), the vnode will be
properly unwound and recycled.

Reported by:  Gary Jennejohn (gj)
Sponsored by: Netflix
2018-12-15 18:35:46 +00:00
mckusick
c376c8363f Must set ip->i_effnlink = ip->i_nlink to avoid a soft updates
"panic: softdep_update_inodeblock: bad link count" when releasing
a partially initialized vnode after an inode check-hash failure.

Reported by:  Gary Jennejohn <gljennjohn@gmail.com>
Reported by:  Peter Holm (pho)
Sponsored by: Netflix
2018-12-15 17:58:42 +00:00
mckusick
830a63af76 Continuing efforts to provide hardening of FFS. This change adds a
check hash to the filesystem inodes. Access attempts to files
associated with an inode with an invalid check hash will fail with
EINVAL (Invalid argument). Access is reestablished after an fsck
is run to find and validate the inodes with invalid check-hashes.
This check avoids a class of filesystem panics related to corrupted
inodes. The hash is done using crc32c.

Note this check-hash is for the inode itself and not any of its
indirect blocks. Check-hash validation may be extended to also
cover indirect block pointers, but that will be a separate (and
more costly) feature.

Check hashes are added only to UFS2 and not to UFS1 as UFS1 is
primarily used in embedded systems with small memories and low-powered
processors which need as light-weight a filesystem as possible.

Reviewed by:  kib
Tested by:    Peter Holm
Sponsored by: Netflix
2018-12-11 22:14:37 +00:00
mjg
7e31d1de7e Remove unused argument to priv_check_cred.
Patch mostly generated with cocinnelle:

@@
expression E1,E2;
@@

- priv_check_cred(E1,E2,0)
+ priv_check_cred(E1,E2)

Sponsored by:	The FreeBSD Foundation
2018-12-11 19:32:16 +00:00
mckusick
a394bed4ee If the vfs.ffs.dotrimcons sysctl option is enabled while a file
deletion is active, specifically after a call to ffs_blkrelease_start()
but before the call to ffs_blkrelease_finish(), ffs_blkrelease_start()
will have handed out SINGLETON_KEY rather than starting a collection
sequence. Thus if we get a SINGLETON_KEY passed to ffs_blkrelease_finish(),
we just return rather than trying to finish the nonexistent sequence.

Reported by:  Warner Losh (imp@)
Sponsored by: Netflix
2018-12-06 01:04:56 +00:00
mckusick
2c9178edde Normally when an attempt is made to mount a UFS/FFS filesystem whose
superblock has a check-hash error, an error message noting the
superblock check-hash failure is printed and the mount fails. The
administrator then runs fsck to repair the filesystem and when
successful, the filesystem can once again be mounted.

This approach fails if the filesystem in question is a root filesystem
from which you are trying to boot. Here, the loader fails when trying
to access the filesystem to get the kernel to boot. So it is necessary
to allow the loader to ignore the superblock check-hash error and make
a best effort to read the kernel. The filesystem may be suffiently
corrupted that the read attempt fails, but there is no harm in trying
since the loader makes no attempt to write to the filesystem.

Once the kernel is loaded and starts to run, it attempts to mount its
root filesystem. Once again, failure means that it breaks to its prompt
to ask where to get its root filesystem. Unless you have an alternate
root filesystem, you are stuck.

Since the root filesystem is initially mounted read-only, it is
safe to make an attempt to mount the root filesystem with the failed
superblock check-hash. Thus, when asked to mount a root filesystem
with a failed superblock check-hash, the kernel prints a warning
message that the root filesystem superblock check-hash needs repair,
but notes that it is ignoring the error and proceeding. It does
mark the filesystem as needing an fsck which prevents it from being
enabled for writing until fsck has been run on it. The net effect
is that the reboot fails to single user, but at least at that point
the administrator has the tools at hand to fix the problem.

Reported by:    Rick Macklem (rmacklem@)
Discussed with: Warner Losh (imp@)
Sponsored by:   Netflix
2018-12-06 00:09:39 +00:00
mckusick
be83b5d4e3 Move the check for the filesystem having been run on a kernel that
predates metadata check hashes so that it is done before deciding
whether to compute a check-hash of the superblock.

Reported by:  Rick Macklem <rmacklem@uoguelph.ca>
Sponsored by: Netflix
2018-11-26 00:58:07 +00:00
mckusick
3216726186 Calculate updated superblock check-hash before writing it into the snapshot.
This corrects a bug that prevented snapshots from being mounted due to a
superblock check-hash failure.

Reported by:  Brennan Vincent <brennan@umanwizard.com>
Tested by:    Peter Holm (pho@)
Sponsored by: Netflix
2018-11-25 18:01:15 +00:00
markj
d08f717585 Ensure that directory entry padding bytes are zeroed.
Directory entries must be padded to maintain alignment; in many
filesystems the padding was not initialized, resulting in stack
memory being copied out to userspace.  With the ino64 work there
are also some explicit pad fields in struct dirent.  Add a subroutine
to clear these bytes and use it in the in-tree filesystems.  The
NFS client is omitted for now as it was fixed separately in r340787.

Reported by:	Thomas Barabosch, Fraunhofer FKIE
Reviewed by:	kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2018-11-23 22:24:59 +00:00