Add functionality for extents validation inside the filesystem
extents block. The main logic is implemented under
ext4_validate_extent_entries() function, which verifies extents
or extents indexes depending of extent depth value.
PR: 259112
Reported by: Robert Morris
Reviewed by: pfg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33375
Rename ext2_dirbadentry() to ext2_check_direntry(). Add directory
entry inode value check, and call ext2_check_direntry() in all cases.
The dirchk sysctl is removed.
PR: 259024,259041
Reported by: Robert Morris
Reviewed by: pfg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D33374
The cookies argument is only used by the NFS server. NFSv2 defines the
cookie as 32 bits on the wire, but NFSv3 increased it to 64 bits. Our
VOP_READDIR, however, has always defined it as u_long, which is 32 bits
on some architectures. Change it to 64 bits on all architectures. This
doesn't matter for any in-tree file systems, but it matters for some
FUSE file systems that use 64-bit directory cookies.
PR: 260375
Reviewed by: rmacklem
Differential Revision: https://reviews.freebsd.org/D33404
This prevents a kernel panic on a damaged ext2 superblock.
PR: 259107
Reported by: Robert Morris <rtm@lcs.mit.edu>
Differential Revision: https://reviews.freebsd.org/D33029
These began to become obsolete in d6d64f0f2c (r137739) and the deal
was later sealed in 003e18aef4 (r137801) when vfs.fifofs.fops was
dropped and vop-bypass for pipes became mandatory.
PR: 225934
Suggested by: markj
Reviewe by: kib, markj
Differential Revision: https://reviews.freebsd.org/D32270
It is needed to invalidate cache in case of inode space removal
to avoid situation, when extents cache returns not exist extent.
Reviewed by: pfg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29931
It is possible to walk thru inode extents if EXT2FS_PRINT_EXTENTS
macro is defined. The extents headers magics and physical blocks
ranges are checked during extents walk.
Reviewed by: pfg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29932
The dev field is placed into the inode structure.
The major/minor numbers conversion to/from linux compatile
format happen during on-disk inodes writing/reading.
Reviewed by: pfg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29930
The birthtime field of struct vattr does not checked
for VNOVAL in case of ext2_setattr() and produce incorrect
inode birthtime values.
Found using pjdfstest:
pjdfstest/tests/utimensat/03.t
Reviewed by: pfg
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D29929
The data is only needed by filesystems that
1. use buffer cache
2. utilize clustering write support.
Requested by: mjg
Reviewed by: asomers (previous version), fsu (ext2 parts), mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28679
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.
Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.
Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
The XTIME_TO_NSEC macro already calls the htole32(), so there is no need
to call it twice. This code does nothing on LE platforms and affects only
nanosecond and birthtime fields so it's difficult to notice on regular use.
Hinted by: DragonFlyBSD (git ae503f8f6f4b9a413932ffd68be029f20c38cab4)
X-MFC with: r361136
The NSEC_TO_XTIME macro already calls the htole32(), so there is no need
to call it twice. This code does nothing on LE platforms and affects only
nanosecond and birthtime fields so it's difficult to notice on regular use.
X-MFC with: r361136
This restriction already present in case of indirect mapping, do the same
in case of extents.
PR: 246182
Reported by: Teran McKinney
MFC after: 2 weeks
Make ext2fs compatible with changes introduced in e2fsprogs v1.45.2.
Now the tail of inode bitmap is filled with 0xff pattern explicitly during
bitmap initialization phase to avoid e2fsck error like:
"Padding at end of inode bitmap is not set."
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.
Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427
The current vnode layout is not smp-friendly by having frequently read data
avoidably sharing cachelines with very frequently modified fields. In
particular v_iflag inspected for VI_DOOMED can be found in the same line with
v_usecount. Instead make it available in the same cacheline as the v_op, v_data
and v_type which all get read all the time.
v_type is avoidably 4 bytes while the necessary data will easily fit in 1.
Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new
flag field with a new value: VIRF_DOOMED.
Reviewed by: kib, jeff
Differential Revision: https://reviews.freebsd.org/D22715
Current implementation of vnode_create_vobject() and
vnode_destroy_vobject() is written so that it prepared to handle the
vm object destruction for live vnode. Practically, no filesystems use
this, except for some remnants that were present in UFS till today.
One of the consequences of that model is that each filesystem must
call vnode_destroy_vobject() in VOP_RECLAIM() or earlier, as result
all of them get rid of the v_object in reclaim.
Move the call to vnode_destroy_vobject() to vgonel() before
VOP_RECLAIM(). This makes v_object stable: either the object is NULL,
or it is valid vm object till the vnode reclamation. Remove code from
vnode_create_vobject() to handle races with the parallel destruction.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21412