Commit Graph

168 Commits

Author SHA1 Message Date
Kevin Lo
2b3506d919 Fix comment. 2016-04-08 04:29:05 +00:00
Edward Tomasz Napierala
ae34b6ff96 Add four new RCTL resources - readbps, readiops, writebps and writeiops,
for limiting disk (actually filesystem) IO.

Note that in some cases these limits are not quite precise. It's ok,
as long as it's within some reasonable bounds.

Testing - and review of the code, in particular the VFS and VM parts - is
very welcome.

MFC after:	1 month
Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D5080
2016-04-07 04:23:25 +00:00
Kevin Lo
df04a188af Update comment: Linux does set a randomized generation number of an inode
on ext2/3/4.

While here use arc4random() instead of random().

Reviewed by:	pfg
MFC after:	3 days
2016-04-01 03:21:01 +00:00
Kevin Lo
98a768596e Update superblock and inode structs for ext4.
Reviewed by:	pfg
2016-03-28 07:44:55 +00:00
Pedro F. Giffuni
308c3c240f Ext2: cleanup setting of ctime/mtime/birthtime.
This adopts the same change as r291936 for UFS.
Directly clear IN_ACCESS or IN_UPDATE when user supplied the time, and
copy the value into the inode.

This keeps the behaviour cleaner and is consistent with UFS.

Reviewed by:	bde
MFC after:	1 month (only 10)
2016-02-19 15:53:08 +00:00
Pedro F. Giffuni
00e24e4173 ext2fs: Remove panics for rename() race conditions.
Sync with r84642 from UFS:

The panics are inappropriate because the IN_RENAME flag only fixes a
few of the huge number of race conditions that can result in the
source path becoming invalid even prior to the VOP_RENAME() call.

Found accidentally while checking an issue from PVS Static Analysis.

MFC after:	3 days
2016-02-14 19:52:50 +00:00
Pedro F. Giffuni
a633908d21 Ext4: Use boolean type instead of '0' and '1'
There are precedents of uses of bool in the kernel and
it is incorrect style to use integers as replacement for
a boolean type.
2016-02-11 15:27:14 +00:00
Pedro F. Giffuni
78f6ea5440 Ext4: fix handling of files with sparse blocks before extent's index.
This is ongoing work from Damjan Jovanovic to improve ext4 read support
with sparse files:

Keep track of the first and last block in each extent as it descends down
the extent tree, thus being able to work out that some blocks are sparse
earlier. This solves an issue on r293680.

In ext4_bmapext() start supporting the runb parameter, which appears to be
the number of adjacent blocks prior to the block being converted in the
same way that runp is the number of blocks after, speding up random access
to mmaped files.

PR:	206652
2016-02-11 00:34:11 +00:00
Pedro F. Giffuni
817dd2573e Revert r294695:
ext2fs: passthrough any extra timestamps to the dinode struct.

While it passed the classic testing, the change appears to have
caused some regression and still requires some more precautions.

PR:		206820
MFC after:	3 days
2016-02-03 14:31:23 +00:00
Pedro F. Giffuni
afbad87898 ext2fs: passthrough any extra timestamps to the dinode struct.
In general we don't trust any of the extended timestamps unless the
EXT2F_ROCOMPAT_EXTRA_ISIZE feature is set. However, in the case where
we freshly allocated a new inode the information is valid and it is
better to pass it along instead of leaving the value undefined.

This should have no practical effect but should reduce the amount of
garbage if EXT2F_ROCOMPAT_EXTRA_ISIZE is set, like in cases where the
filesystem is converted from ext3 to ext4.

MFC after:	4 days
2016-01-24 23:24:47 +00:00
Pedro F. Giffuni
386b134364 ext2: rename some directory index constants.
Missed from r294653.

Pointyhat:	me
2016-01-24 04:30:30 +00:00
Pedro F. Giffuni
c22ff471b4 Fix comment. 2016-01-24 02:44:00 +00:00
Pedro F. Giffuni
9b58c8019f Rename some directory index constants.
Directory index was introduced in ext3. We don't always use the
prefix to denote the ext2 variant they belong to but when we
do we should try to be accurate.
2016-01-24 02:41:49 +00:00
Pedro F. Giffuni
e08ad8f068 ext2: Initialize i_flag after allocation.
We use i_flag to carry some flags like IN_E4INDEX which newer
ext2fs variants uses internally.

fsck.ext3 rightfully complains after our implementation tags
non-directory inodes with INDEX_FL.

Initializing i_flag during allocation removes the noise factor
and quiets down fsck.

Patch from:	Damjan Jovanovic
PR:		206530
2016-01-24 02:25:41 +00:00
Pedro F. Giffuni
9824e4adbe ext2fs: Bring back the htree dir_index implementation.
The htree dir_index is perhaps one of the most characteristic
features of the linux ext3 implementation. It was removed
in r281670, due to repeated bug reports.

Damjan Jovanic detected and fixed three bugs and did some
stress testing by building Apache OpenOffice on top of it
so it is now in good shape to bring back.

Differential Revision:	https://reviews.freebsd.org/D5007

Submitted by:	Damjan Jovanovic
Reviewed by:	pfg
Tested by:	pho
Relnotes:	Yes
MFC after:	2 months (only 10.x)
2016-01-21 14:50:28 +00:00
Pedro F. Giffuni
daf884fa9f ext4: mount panic from freeing invalid pointers
Initialize the struct with those fields to zeroes on allocation,
preventing the panic.

Patch by:	Damjan Jovanovic.

PR:		206056
MFC after:	3 days
2016-01-11 19:25:43 +00:00
Pedro F. Giffuni
e813d9d7fa ext4: add support for reading sparse files
Add support for sparse files in ext4. Also implement read-ahead, which
greatly increases the performance when transferring files from ext4.

Both features implemented by Damjan Jovanovic.

PR:		205816
MFC after:	1 week
2016-01-11 19:14:55 +00:00
Pedro F. Giffuni
7135ca50c1 ext2fs: reading mmaped file in Ext4 causes panic
Always call brelse(path.ep_bp), fixing reading EXT4 files using mmap().

Patch by Damjan Jovanovic.

PR:		205938
MFC after:	1 week
2016-01-07 21:43:43 +00:00
Pedro F. Giffuni
26069aec57 ext2: recognize ext4 INCOMPAT_RECOVER flag
This is a flag specific for journalling in ext4.
Add it to the list of ext4 features we ignore for
read-only purposes.

PR:		205668
MFC after:	1 week
2015-12-29 15:51:52 +00:00
Jeff Roberson
7d07bfd8a3 - Remove some dead code copied from ffs. 2015-07-29 03:06:08 +00:00
Pedro F. Giffuni
e54a659a26 Provide VOP_GETPAGES_ASYNC() for extfs.
Merge the filesystem specific part from r274914 to ext2fs.

I only did regular testing with the change but UFS and our ext2fs
are similar enough that the code should just work with the new
sendfile.

Discussed with:	glebius
2015-05-28 21:06:59 +00:00
Pedro F. Giffuni
f738ee4825 Drop experimental dir_index support.
The htree directory index is a highly desirable feature for research
purposes and was meant to improve performance in our ext2/3 driver.
Unfortunately our implementation has two problems:

- It never really delivered any performance improvement.
- It appears to corrupt the filesystem in undetermined circumstances.

Strictly speaking dir_index is not required for read/write support in
ext2/3 and our limited ext4 support still works fine without it.

Regain stability in the ext2 driver by removing it. We may need it back
(fixed) if we want to support encrypted ext4 support but thanks to the
wonders of version control we can always revert this change and bring it
back.

PR:	191895
PR:	198731
PR:	199309

MFC after:	5 days
2015-04-17 22:26:01 +00:00
Rick Macklem
dda11d4ab9 File systems that do not use the buffer cache (such as ZFS) must
use VOP_FSYNC() to perform the NFS server's Commit operation.
This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which
is set by file systems that use the buffer cache. If this flag
is not set, the NFS server always does a VOP_FSYNC().
This should be ok for old file system modules that do not set
MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although
it might not be optimal for file systems that use the buffer cache.

Reviewed by:	kib
MFC after:	2 weeks
2015-04-15 20:16:31 +00:00
Pedro F. Giffuni
e5c356b2a2 ext2fs: Plug small memory leak
free() e2fs_contigdirs upon error.
Undo zeroing of e2fs_gd as this was actually a false positive.

X-MFC with:	278790
2015-02-15 14:25:00 +00:00
Pedro F. Giffuni
6be4cf2244 Reuse value of cursize instead of recalculating.
Reported by:	Clang static checker
MFC after:	1 week
2015-02-15 01:34:00 +00:00
Pedro F. Giffuni
f3ee91ed2b Initialize the allocation of variables related to the ext2 allocator.
The e2fs_gd struct was not being initialized and garbage was
being used for hinting the ext2 allocator variant.
Use malloc to clear the values and also initialize e2fs_contigdirs
during allocation to keep consistency.

While here clean up small style issues.

Reported by:	Clang static analyser
MFC after:	1 week
2015-02-15 01:12:15 +00:00
Enji Cooper
ae266f893f Fix the build when INVARIANTS is defined by restoring bo's definition in
ext2_truncate(..) and by putting it under INVARIANTS ifdefs

X-MFC with: r277354
MFC after: 2 weeks
2015-01-19 07:10:08 +00:00
Pedro F. Giffuni
9a53618ab2 ext2: Garbage-collect some unused variables
Reported by:	clang static analysis
MFC after:	2 weeks
2015-01-19 03:30:45 +00:00
Pedro F. Giffuni
7075482d4a ext2: fix for uninitialized pointer read.
path.ep_bp was being used uninitialized in ext4_ext_find_extent().

CID:		1062344
MFC after:	1 week
2015-01-18 21:18:28 +00:00
Pedro F. Giffuni
955ba37baa Remove dead code.
After the ext2 variant of the "orlov allocator" was implemented,
the case for a negative or zero dirsize disappeared.

Drop the dead code and unsign dirsize given that it can't be
negative anyways.

CID:		1008669
MFC after:	1 week
2015-01-18 20:26:27 +00:00
Pedro F. Giffuni
84b170d298 ext2: cosmetical issues
Minor sorting and note when the cases are expected to fall through.

MFC after:	1 week
2015-01-17 15:19:18 +00:00
Konstantin Belousov
789bdfdbc6 Handle MAKEENTRY cnp flag in the VOP_CREATE(). Curiously, some
fs, e.g. smbfs, already did it.

Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-12-21 13:29:33 +00:00
Konstantin Belousov
6c21f6edb8 The VOP_LOOKUP() implementations for CREATE op do not put the name
into namecache, to avoid cache trashing when doing large operations.
E.g., tar archive extraction is not usually followed by access to many
of the files created.

Right now, each VOP_LOOKUP() implementation explicitely knowns about
this quirk and tests for both MAKEENTRY flag presence and op != CREATE
to make the call to cache_enter().  Centralize the handling of the
quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP.
VFS now sets NOCACHE flag for CREATE namei() calls.

Note that the change in semantic is backward-compatible and could be
merged to the stable branch, and is compatible with non-changed
third-party filesystems which correctly handle MAKEENTRY.

Suggested by:	Chris Torek <torek@pi-coral.com>
Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-12-18 10:01:12 +00:00
Gleb Kurtsou
dde58752db Adjust printf format specifiers for dev_t and ino_t in kernel.
ino_t and dev_t are about to become uint64_t.

Reviewed by:	kib, mckusick
2014-12-17 07:27:19 +00:00
Pedro F. Giffuni
8f87059b41 ext2fs: Fix old out-of-bounds access.
Overrunning buffer pointed to by (caddr_t)&oip->i_db[0] of 48 bytes by
passing it to a function which accesses it at byte offset 59 using
argument 60UL.

The issue was inherited from an older FFS implementation and
fixed there with by merging UFS2 in r98542. We follow the
FFS fix.

Discussed with:	bde
CID:		1007665
MFC after:	3 days
2014-12-09 14:56:00 +00:00
Pedro F. Giffuni
33587684a6 ifdef ext2_print_inode which is not really used.
ext2_print_inode is not really used but it was nice to
have for initial development work. #ifdef it under a
new EXT2FS_DEBUG knob so that we don't spend time
compiling it.

MFC after:	3 days
2014-11-12 16:23:56 +00:00
Konstantin Belousov
df5c9c0411 Do not set IN_ACCESS flag for read-only mounts. The IN_ACCESS
survives remount in rw, also it is set for vnodes on rootfs before
noatime can be set or clock is adjusted.  All conditions result in
wrong atime for accessed vnodes.

Submitted by:	bde
MFC after:	1 week
2014-10-11 19:09:56 +00:00
Konstantin Belousov
d15b55c554 Provide the unique implementation for the VOP_GETPAGES() method used
by ffs and ext2fs.  Remove duplicated call to vm_page_zero_invalid(),
done by VOP and by vm_pager_getpages().  Use vm_pager_free_nonreq().

Reviewed by:	alc (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	6 weeks (after r271596)
2014-09-15 12:28:29 +00:00
Alan Cox
3e5c84e292 We don't need an exclusive object lock on the expected execution path
through {ext2,ffs}_getpages().

Reviewed by:	kib, pfg
MFC after:	6 weeks
Sponsored by:	EMC / Isilon Storage Division
2014-09-13 18:26:13 +00:00
Pedro F. Giffuni
6e582ca2a5 Extra space from r271467.
MFC after:	2 months
2014-09-12 15:54:18 +00:00
Pedro F. Giffuni
1d0fce9bfe ext2fs: add ext2_getpages().
Literally copy/pasted from ffs_getpages().

Tested with:	fsx
MFC after:	2 months
2014-09-12 15:49:21 +00:00
Pedro F. Giffuni
7c1be3f5c7 Revert r269523:
Providing a higher EXT2_LINK_MAX limit is a bad idea for ext2/3.

Discussed with:	bde
2014-08-05 01:25:14 +00:00
Pedro F. Giffuni
55806cca37 set EXT2_LINK_MAX to LINK_MAX
In linux EXT4_LINK_MAX is now 64000.  We can't really do that
since i_nlink and va_nlink are signed so setting higher values
is likely to cause trouble.

This is a system limitation so set the EXT_LINK_MAX to
what the system can handle.

MFC after:	3 days
2014-08-04 16:41:06 +00:00
Konstantin Belousov
65589a29f4 Check for the cross-device cross-link attempt in the VFS, instead of
forcing filesystem VOP_LINK() methods to repeat the code.  In
tmpfs_link(), remove redundand check for the type of the source,
already done by VFS.

Note that NFS server already performs this check before calling
VOP_LINK().

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2014-07-16 14:04:46 +00:00
Pedro F. Giffuni
ca73017a2d Revert r263449;
ext2fs: minor update to the dirpref policy.

The change in UFS r254996, reverted the change as the
older code seems to work better. This was not visible
in local testing but we can trust UFS is vastly more
exercised in diferent environments.
2014-03-21 04:33:38 +00:00
Pedro F. Giffuni
e23c349230 ext2fs: minor update to the dirpref policy.
Bring in a minor change to the dirpref policy based on r248623.

This is pretty minimal change to keep the implementation in
sync with UFS but other parts from the original change are not
directly applicable so don't expect improvements in fsck times.

MFC after:	2 weeks
2014-03-20 21:19:13 +00:00
Pedro F. Giffuni
157b40af8e ext2fs: Fix a bug when sorting htree entries.
This a typo introduced when bringing the original code from NetBSD.

Reported by:	Mike Ma
MFC after:	3 days
2014-03-06 21:02:16 +00:00
Pedro F. Giffuni
c3b76e1345 ext2fs: small formatting fixes.
Remove some redundant spaces.
No functional change.

MFC after:	3 days
2014-03-01 21:22:20 +00:00
Pedro F. Giffuni
3a54024da4 ext2fs: use of tab vs spaces.
Consistently use a single tab after a #define as mentioned in style(9).
Use tabs instead of space for indenting.
Fix a typo: "hash_vesion".

No functional change.

MFC after:	3 days
2014-02-28 21:25:32 +00:00
Pedro F. Giffuni
67da48a15b ext2fs: fully enable ext4 read-only support.
The ext4 developers tend to tag Ext4-specific flags as
"incompatible" even when such features are not relevant for
read-only support.  This is a consequence of the process
though which this filesystem is implemented without design
and the fact that some new features are not extensible to
ext2/3.

Organize the features according to what we support and sort
them so that we can now read-only mount filesystems with
some features that may be found in newly formatted ext4 fs.

Submitted by:	Zheng Liu
Reviewed by:	pfg
MFC after:	5 days
2014-02-22 22:07:16 +00:00