The ext4 inode flags do not have equivalents for chflags (1)
and hold information that is private to the implementation.
The i_flag field in the inode is a better place to hold the Ext4
inode flags as it saves us from masking flags while setting or
getting attributes. It should also make things cleaner if we
implement write support for Ext4.
Suggested by: bde
Tested by: Mike Ma
MFC after: 3 days
The IN_* flags should be set in i_flag instead of corrupting
i_flags [1].
Re-enable HTree dirindex as the last series of bug fixes
seems to have fixed the issues.
Reported by: bde [1]
Tested by: kevlo
MFC after: 1 week
Use the bitwise negation instead of bogus boolean negation and move
the flag manipulation with the assignment.
Fix some grammatical errors introduced in the same change.
Reported by: bde
MFC after: 3 days
r260545 cleared the inode flags to fix corruption problems but
we still need to pass some EXT4 flags for the ext4 read-only
mode. None of these attributes has an equivalent in FreeBSD and
are uninteresting for the system utilities so they should be
innaccessible in ext2_getattrib().
Note: we also use EXT4_HUGE_FILE but we use it directly from the
dinode structure so it is not necessary to translate it,
Suggested by: bde
MFC after: 3 days
After r252890 we are naively attempting to pass through the
inode flags. This is technically incorrect as the ext2
inode flags don't match the UFS/system values used in
FreeBSD and a clean conversion is needed.
Some filtering was left in place so the change didn't cause
significant changes in FreeBSD but some of the garbage passed
is likely to be the cause for warning messages in linux.
Fix the issue by resetting the flags before conversion as was
done previously. This also means we will not pass the EXT4_*
inode flags into FreeBSD's inode.
PR: kern/185448
MFC after: 3 days
According to online documentation [1], Ext4 has two new "special"
inodes so add the new exclude and replica inodes.
Reference:
[1] https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout
Reported by: Mike Ma
MFC after: 3 weeks
di_extsize is the EA size and as such it should be unsigned.
Adjust related types for consistency.
Reviewed by: mckusick (previous version)
MFC after: 3 weeks
Our code does not consider yet the case of hash collisions. This
is a rather annoying situation where two or more files that
happen to have the same hash value will not appear accessible.
The situation is not difficult to work-around but given that things
will just work without enabling htree we will save possible
embarrassments for the next release.
Reported by: Kevin Lo
Previous bandaid was not appropriate and didn't really work for
all platforms. While here, cleanup the surrounding code to match
ffs_checkoverlap()
Reported by: dim, jmallet and bde
MFC after: 3 weeks
Add definitions for e2fs_daddr_t, e4fs_daddr_t in addition
to the already existing e2fs_lbn_t and adjust them for ext4.
Other than making the code more readable these changes should
fix problems related to big filesystems.
Setting the proper types can be tricky so the process was
helped by looking at UFS. In our implementation, logical block
numbers can be negative and the code depends on it. In ext2,
block numbers are unsigned so it is convenient to keep
e2fs_daddr_t unsigned and use the complete 32 bits. In the
case of e4fs_daddr_t, while the value should be unsigned, for
ext4 we only need to support 48 bits so preserving an extra
bit from the sign is not an issue.
While here also drop the ext2_setblock() prototype that was
never used.
Discussed with: mckusick, bde
MFC after: 3 weeks
Basic support for extents was implemented by Zheng Liu as part
of his Google Summer of Code in 2010. This support is read-only
at this time.
In addition to extents we also support the huge_file extension
for read-only purposes. This works nicely with the additional
support for birthtime/nanosec timestamps and dir_index that
have been added lately.
The implementation may not work for all ext4 filesystems as
it doesn't support some features that are being enabled by
default on recent linux like flex_bg. Nevertheless, the feature
should be very useful for migration or simple access in
filesystems that have been converted from ext2/3 or don't use
incompatible features.
Special thanks to Zheng Liu for his dedication and continued
work to support ext2 in FreeBSD.
Submitted by: Zheng Liu (lz@)
Reviewed by: Mike Ma, Christoph Mallon (previous version)
Sponsored by: Google Inc.
MFC after: 3 weeks
The htree implementation uses code derived from the
RSA Data Security, Inc. MD4 Message-Digest Algorithm.
Add a proper licensing statement for the code and clarify
the corresponding comments.
Approved by: core (hrs)
as in <sys/dirent.h>
ext2_readdir() has always been very fs specific and different
with respect to its ufs_ counterpart. Recent changes from UFS
have made it possible to share more closely the implementation.
MFUFS r252438:
Always start parsing at DIRBLKSIZ aligned offset, skip first entries if
uio_offset is not DIRBLKSIZ aligned. Return EINVAL if buffer is too
small for single entry.
Preallocate buffer for cookies.
Skip entries with zero inode number.
Reviewed by: gleb, Zheng Liu
MFC after: 1 month
Merge from UFS r231313:
This change first attempts the uiomove() to the newly allocated
(and dirty) buffer and only zeros it if the uiomove() fails. The
effect is to eliminate the gratuitous zeroing of the buffer in
the usual case where the uiomove() successfully fills it.
MFC after: 3 days
This is a port of NetBSD's GSoC 2012 Ext3 HTree directory indexing
by Vyacheslav Matyushin. It was cleaned up and enhanced for FreeBSD
by Zheng Liu (lz@).
This is an excellent example of work shared among different projects:
Vyacheslav was able to look at an early prototype from Zheng Liu who
was also able to check the code from Haiku (with permission).
As in linux, the feature is not available by default and must be
enabled explicitly with tune2fs. We still do not support the
workarounds required in readdir for NFS.
Submitted by: Zheng Liu
Tested by: Mike Ma
Sponsored by: Google Inc.
MFC after: 1 week
r156418:
Don't set IN_CHANGE and IN_UPDATE on inodes for potentially suspended
file systems. This could cause deadlocks when creating snapshots.
(We can't do snapshots on ext2fs but it is useful to keep things in sync).
r183079:
- Only set i_offset in the parent directory's i-node during a lookup for
non-LOOKUP operations.
- Relax a VOP assertion for a DELETE lookup.
r187528:
Move the code from ufs_lookup.c used to do dotdot lookup, into
the helper function. It is supposed to be useful for any filesystem
that has to unlock dvp to walk to the ".." entry in lookup routine.
MFC after: 5 days
In line to what is done in UFS, define an internal type
e2fs_lbn_t for the logical block numbers.
This change is basically a no-op as the new type is unchanged
(int32_t) but it may be useful as bumping this may be required
for ext4fs.
Also, as pointed out by Bruce Evans:
-Use daddr_t for daddr in ext2_bmaparray(). This seems to
improve reliability with the reallocblks option.
- Add a cast to the fsbtodb() macro as in UFS.
Reviewed by: bde
MFC after: 3 days
In the ext2fs driver we have a mixture of headers:
- The ext2_ prefixed headers have strong influence from NetBSD
and are carry specific ext2/3/4 information.
- The unprefixed headers are inspired on UFS and carry implementation
specific information.
Do some small adjustments so that the information is easier to
find coming from either UFS or the NetBSD implementation.
MFC after: 3 days
While the changes in r245820 are in line with the ext2 spec,
the code derived from UFS can use negative values so it is
better to relax some types to keep them as they were, and
somewhat more similar to UFS. While here clean some casts.
Some of the original types are still wrong and will require
more work.
Discussed with: bde
MFC after: 3 days
The superblock in ext2fs defines all the fields as unsigned but for
some reason the in-memory superblock was carrying e2fs_bpg and
e2fs_isize as signed.
We should preserve the specified types for consistency.
MFC after: 5 days
Uncover some, previously reserved, fields that are used by Ext4.
These are currently unused but it is good to have them for future
reference.
Reviewed by: bde
MFC after: 3 days
- Use a shared bufobj lock in getblk() and inmem().
- Convert softdep's lk to rwlock to match the bufobj lock.
- Move INFREECNT to b_flags and protect it with the buf lock.
- Remove unnecessary locking around bremfree() and BKGRDINPROG.
Sponsored by: EMC / Isilon Storage Division
Discussed with: mckusick, kib, mdf
- Don't insert BKGRDMARKER bufs into the splay or dirty/clean buf lists.
No consumers need to find them there and it complicates the tree.
These flags are all FFS specific and could be moved out of the buf
cache.
- Use pbgetvp() and pbrelvp() to associate the background and journal
bufs with the vp. Not only is this much cheaper it makes more sense
for these transient bufs.
- Fix the assertions in pbget* and pbrel*. It's not safe to check list
pointers which were never initialized. Use the BX flags instead. We
also check B_PAGING in reassignbuf() so this should cover all cases.
Discussed with: kib, mckusick, attilio
Sponsored by: EMC / Isilon Storage Division
cluster_write() and cluster_wbuild() functions. The flags to be
allowed are a subset of the GB_* flags for getblk().
Sponsored by: The FreeBSD Foundation
Tested by: pho
e2fs_maxcontig was modelled after UFS when bringing the
"Orlov allocator" to ext2. On UFS fs_maxcontig is kept in the
superblock and is used by userland tools (fsck and growfs),
In ext2 this information is volatile so it is not available
for userland tools, so in this case it doesn't have sense
to carry it in the in-memory superblock.
Also remove a pointless check for MAX(1, x) > 0.
Submitted by: Christoph Mallon
MFC after: 2 weeks