Commit Graph

747 Commits

Author SHA1 Message Date
trasz
402e3baade Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize().
Reviewed by:	kib
2010-05-05 16:44:25 +00:00
lulf
8ab7715033 Bring in the ext2fs work done by Aditya Sarawgi during and after Google Summer
of Code 2009:

- BSDL block and inode allocation policies for ext2fs. This involves the use
  FFS1 style block and inode allocation for ext2fs. Preallocation was removed
  since it was GPL'd.
- Make ext2fs MPSAFE by introducing locks to per-mount datastructures.
- Fixes for kern/122047 PR.
- Various small bugfixes.
- Move out of gnu/ directory.

Sponsored by:   Google Inc.
Submitted by:	Aditya Sarawgi <sarawgi.aditya AT SPAMFREE gmail DOT com>
2010-01-14 14:30:54 +00:00
mbr
7450f52a57 Remove extraneous semicolons, no functional changes.
Submitted by:	Marc Balmer <marc@msys.ch>
MFC after:	1 week
2010-01-07 21:01:37 +00:00
trasz
c74ecb3d07 Remove unused code. 2009-12-03 18:16:14 +00:00
jh
62779a0a26 File flags handling fixes for ext2fs:
- Disallow setting of flags not supported by ext2fs.
- Map EXT2_APPEND_FL to SF_APPEND.
- Map EXT2_IMMUTABLE_FL to SF_IMMUTABLE.
- Map EXT2_NODUMP_FL to UF_NODUMP.

Note that ext2fs doesn't support user settable append and immutable flags.
EXT2_NODUMP_FL is an user settable flag also on Linux.

PR:		kern/122047
Reported by:	Ighighi
Submitted by:	Aditya Sarawgi (original version)
Reviewed by:	bde
Approved by:	trasz (mentor)
2009-11-05 04:51:38 +00:00
rdivacky
56ea66b4d9 Fix the build by using proper format.
Pointy hat: me
Approved by: kib
2009-06-25 16:48:13 +00:00
rdivacky
fdb650857e Switch cmd argument of ioctl to u_long as elsewhere in the kernel.
Propagate this change down the callchain.

Approved by:	kan (maintainer)
Approved by:	ed (mentor)
2009-06-25 08:52:20 +00:00
kib
b8351fcda2 Do not use casts (int *)0 and (struct thread *)0 for the arguments of
vn_rdwr, use NULL.

Reviewed by:	jhb
MFC after:	1 week
2009-06-16 15:13:45 +00:00
kib
754aba6b97 Fix r193923 by noting that type of a_fp is struct file *, not int.
It was assumed that r193923 was trivial change that cannot be done
wrong.

MFC after:	2 weeks
2009-06-10 14:24:31 +00:00
kib
2a37bc559b s/a_fdidx/a_fp/ for VOP_OPEN comments that inline struct vop_open_args
definition.

Discussed with:	bde
MFC after:	2 weeks
2009-06-10 14:09:05 +00:00
stas
cd4e8f2232 - Outindent long printf lines instead of splitting them in the
middle of senetences. This also makes the code more consistent
  with the corresponding FFS code.
- Use 2-space sentences breaks consistently.

Suggested by:	bde
2009-06-07 08:42:26 +00:00
stas
673e1d1fc9 - Remove unused sparc64-bitops.h file. Our ext2fs code doesn't use
sparc64-specific bitops implemetations and relies on generic ones.
  Furthermore, bitops implementations present in sparc64-bitops.h
  are written in C similarly to generic bitops.
2009-06-03 17:30:10 +00:00
stas
25e7b6da03 - Style(9) improvements.
- Convert all K&R definitions to ANSI equialents.
- Retire bsd_malloc and bsd_free macros and
  use malloc/free directly.
- Drop some unused debugging calls.

This commit brings no functional changes.
2009-06-03 14:18:37 +00:00
stas
d6b21c236c - Sync our copies of ext2fs Linux headers to current Linux versions.
Minimize differencies between our ext2fs headers and relevant Linux
  versions by using EXT2_SB macro to access the superblock fields. Most
  of the differencies in access to these fields are now hidden inside
  this macro.
- Rename the s_db_per_group field of ext2fs_sb_info to s_gdb_count
  to reflect the similar change in Linux headers. New name also seem
  to be more appropriate for this field.
- Use proper types for s_first_inode and s_inode_size in-core superblock
  fields. Now they reflec types used in the on-disk superblock version.
- Add support for older filesystem revisions that doesn't have proper
  s_first_ino and s_inode_size fields in the on-disk superblock. In these
  cases predefined values for these fields are used.
- Add simple sanity checks for s_first_inode and s_inode_size correctness.

Reviewed by:	bde (previous version)
MFC after:	2 weeks
2009-06-03 13:25:50 +00:00
kan
35f48e9de6 Remove empty files and do nto try to build them.
Apparently, they are problematic for CTF users.

PR:	119298
Submitted by: Julian H. Stacey
2009-05-18 17:20:24 +00:00
attilio
902219327c FreeBSD right now support 32 CPUs on all the architectures at least.
With the arrival of 128+ cores it is necessary to handle more than that.
One of the first thing to change is the support for cpumask_t that needs
to handle more than 32 bits masking (which happens now).  Some places,
however, still assume that cpumask_t is a 32 bits mask.
Fix that situation by using always correctly cpumask_t when needed.

While here, remove the part under STOP_NMI for the Xen support as it
is broken in any case.

Additively make ipi_nmi_pending as static.

Reviewed by:	jhb, kmacy
Tested by:	Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2009-05-14 17:43:00 +00:00
attilio
1dcb84131b Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS.  Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled.  Bump __FreeBSD_version in order to signal such
situation.
2009-05-11 15:33:26 +00:00
kib
7ad4235008 Fix two issues with bufdaemon, often causing the processes to hang in
the "nbufkv" sleep.

First, ffs background cg group block write requests a new buffer for
the shadow copy. When ffs_bufwrite() is called from the bufdaemon due
to buffers shortage, requesting the buffer deadlock bufdaemon.
Introduce a new flag for getnewbuf(), GB_NOWAIT_BD, to request getblk
to not block while allocating the buffer, and return failure
instead. Add a flag argument to the geteblk to allow to pass the flags
to getblk(). Do not repeat the getnewbuf() call from geteblk if buffer
allocation failed and either GB_NOWAIT_BD is specified, or geteblk()
is called from bufdaemon (or its helper, see below). In
ffs_bufwrite(), fall back to synchronous cg block write if shadow
block allocation failed.

Since r107847, buffer write assumes that vnode owning the buffer is
locked. The second problem is that buffer cache may accumulate many
buffers belonging to limited number of vnodes. With such workload,
quite often threads that own the mentioned vnodes locks are trying to
read another block from the vnodes, and, due to buffer cache
exhaustion, are asking bufdaemon for help. Bufdaemon is unable to make
any substantial progress because the vnodes are locked.

Allow the threads owning vnode locks to help the bufdaemon by doing
the flush pass over the buffer cache before getnewbuf() is going to
uninterruptible sleep. Move the flushing code from buf_daemon() to new
helper function buf_do_flush(), that is called from getnewbuf().  The
number of buffers flushed by single call to buf_do_flush() from
getnewbuf() is limited by new sysctl vfs.flushbufqtarget.  Prevent
recursive calls to buf_do_flush() by marking the bufdaemon and threads
that temporarily help bufdaemon by TDP_BUFNEED flag.

In collaboration with:	pho
Reviewed by:	 tegge (previous version)
Tested by:	 glebius, yandex ...
MFC after:	 3 weeks
2009-03-16 15:39:46 +00:00
das
ab8f6aab1a Don't declare bin_search() as an inline function, since there's no
inline definition of it.
2009-03-08 06:14:33 +00:00
ed
322413c46c Add memmove() to the kernel, making the kernel compile with Clang.
When copying big structures, LLVM generates calls to memmove(), because
it may not be able to figure out whether structures overlap. This caused
linker errors to occur. memmove() is now implemented using bcopy().
Ideally it would be the other way around, but that can be solved in the
future. On ARM we don't do add anything, because it already has
memmove().

Discussed on:	arch@
Reviewed by:	rdivacky
2009-02-28 16:21:25 +00:00
stas
963ae12224 - Eliminate warnings in debug print macros by explicitly converting all
field to unsigned long.
2009-01-18 15:10:46 +00:00
stas
4bddbf2189 - Whitespace fixes.
- s_bmask field doesn't exist.
- Use correct flags in debug printf.
2009-01-18 14:54:46 +00:00
stas
4a91cd2088 - Obtain inode sizes and location of the first inode based on the contents
of superblock rather than using hardcoded values. This fixes ext2fs on
  filesystems with inode sized other than 128.

Submitted by:	Alex Lyashkov <Alexey.Lyashkov@Sun.COM> (based on)
MFC after:	2 weeks
2009-01-18 14:04:56 +00:00
kib
67a76167a4 Do not incorrectly add the low 5 bits of the offset to the resulting
position of the found zero bit.

Submitted by:	Jaakko Heinonen <jh saunalahti fi>
MFC after:	2 weeks
2009-01-04 15:56:49 +00:00
trasz
b2515a861b According to phk@, VOP_STRATEGY should never, _ever_, return
anything other than 0.  Make it so.  This fixes
"panic: VOP_STRATEGY failed bp=0xc320dd90 vp=0xc3b9f648",
encountered when writing to an orphaned filesystem.  Reason
for the panic was the following assert:
KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp));
at vfs_bio:bufstrategy().

Reviewed by:	scottl, phk
Approved by:	rwatson (mentor)
Sponsored by:	FreeBSD Foundation
2008-12-16 21:13:11 +00:00
trasz
7f8309245b Adapt to accmode_t changes.
Approved by:	rwatson (mentor), kan
2008-11-14 09:58:16 +00:00
attilio
e1f493235e Improve VFS locking:
- Implement real draining for vfs consumers by not relying on the
  mnt_lock and using instead a refcount in order to keep track of lock
  requesters.
- Due to the change above, remove the mnt_lock lockmgr because it is now
  useless.
- Due to the change above, vfs_busy() is no more linked to a lockmgr.
  Change so its KPI by removing the interlock argument and defining 2 new
  flags for it: MBF_NOWAIT which basically replaces the LK_NOWAIT of the
  old version (which was unlinked from the lockmgr alredy) and
  MBF_MNTLSTLOCK which provides the ability to drop the mountlist_mtx
  once the mnt interlock is held (ability still desired by most consumers).
- The stub used into vfs_mount_destroy(), that allows to override the
  mnt_ref if running for more than 3 seconds, make it totally useless.
  Remove it as it was thought to work into older versions.
  If a problem of "refcount held never going away" should appear, we will
  need to fix properly instead than trust on such hackish solution.
- Fix a bug where returning (with an error) from dounmount() was still
  leaving the MNTK_MWAIT flag on even if it the waiters were actually
  woken up. Just a place in vfs_mount_destroy() is left because it is
  going to recycle the structure in any case, so it doesn't matter.
- Remove the markercnt refcount as it is useless.

This patch modifies VFS ABI and breaks KPI for vfs_busy() so manpages and
__FreeBSD_version will be modified accordingly.

Discussed with:	kib
Tested by:	pho
2008-11-02 10:15:42 +00:00
trasz
0ad8692247 Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.

Approved by:	rwatson (mentor)
2008-10-28 13:44:11 +00:00
kib
41033a8f94 Garbage-collect ext2_kqfilter vop that is now a copy of vop_stdkqfilter(). 2008-10-28 12:15:11 +00:00
des
66f807ed8b Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after:	3 months
2008-10-23 15:53:51 +00:00
attilio
b8bf37e585 Remove the struct thread unuseful argument from bufobj interface.
In particular following functions KPI results modified:
- bufobj_invalbuf()
- bufsync()

and BO_SYNC() "virtual method" of the buffer objects set.
Main consumers of bufobj functions are affected by this change too and,
in particular, functions which changed their KPI are:
- vinvalbuf()
- g_vfs_close()

Due to the KPI breakage, __FreeBSD_version will be bumped in a later
commit.

As a side note, please consider just temporary the 'curthread' argument
passing to VOP_SYNC() (in bufsync()) as it will be axed out ASAP

Reviewed by:	kib
Tested by:	Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2008-10-10 21:23:50 +00:00
kib
a127656dea fdescfs, devfs, mqueuefs, nfs, portalfs, pseudofs, tmpfs and xfs
initialize the vattr structure in VOP_GETATTR() with VATTR_NULL(),
vattr_null() or by zeroing it. Remove these to allow preinitialization
of fields work in vn_stat(). This is needed to get birthtime initialized
correctly.

Submitted by:   Jaakko Heinonen <jh saunalahti fi>
Discussed on:   freebsd-fs
MFC after:	1 month
2008-09-20 19:50:52 +00:00
kib
81d455e702 Initialize va_flags and va_filerev properly in VOP_GETATTR(). Don't
initialize va_vaflags and va_spare because they are not part of the
VOP_GETATTR() API. Also don't initialize birthtime to ctime or zero.

Submitted by:   Jaakko Heinonen <jh saunalahti fi>
Reviewed by:	bde
Discussed on:   freebsd-fs
MFC after:	1 month
2008-09-20 19:46:45 +00:00
kib
039c5da1b2 Garbage-collect vn_write_suspend_wait().
Suggested and reviewed by:	tegge
Tested by:	pho
MFC after:	1 month
2008-09-16 11:09:26 +00:00
sam
05a7094fc1 Make ddb command registration dynamic so modules can extend
the command set (only so long as the module is present):
o add db_command_register and db_command_unregister to add and remove
  commands, respectively
o replace linker sets with SYSINIT's (and SYSUINIT's) that register
  commands
o expose 3 list heads: db_cmd_table, db_show_table, and db_show_all_table
  for registering top-level commands, show operands, and show all operands,
  respectively

While here also:
o sort command lists
o add DB_ALIAS, DB_SHOW_ALIAS, and DB_SHOW_ALL_ALIAS to add aliases
  for existing commands
o add "show all trace" as an alias for "show alltrace"
o add "show all locks" as an alias for "show alllocks"

Submitted by:	Guillaume Ballet <gballet@gmail.com> (original version)
Reviewed by:	jhb
MFC after:	1 month
2008-09-15 22:45:14 +00:00
trasz
9303940ffe Remove VSVTX, VSGID and VSUID. This should be a no-op,
as VSVTX == S_ISVTX, VSGID == S_ISGID and VSUID == S_ISUID.

Approved by:	rwatson (mentor)
2008-09-10 13:16:41 +00:00
attilio
e2ca413d09 Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions.
Manpages are updated accordingly.

Tested by:	Diego Sardina <siarodx at gmail dot com>
2008-08-31 14:26:08 +00:00
attilio
dbf35e279f Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2008-08-28 15:23:18 +00:00
attilio
823ce79a5b - Disallow XFS mounting in write mode. The write support never worked really
and there is no need to maintain it.
- Fix vn_get() in order to let it call vget(9) with a valid locking
  request.  vget(9) returns the vnode locked in order to prevent recycling,
  but in this case internal XFS locks alredy prevent it from happening, so
  it is safe to drop the vnode lock before to return by vn_get().
- Add a VNASSERT() in vget(9) in order to catch malformed locking requests.

Discussed with:	kan, kib
Tested by:	Lothar Braun <lothar at lobraun dot de>
2008-07-21 23:01:09 +00:00
kib
52243403eb Move the head of byte-level advisory lock list from the
filesystem-specific vnode data to the struct vnode. Provide the
default implementation for the vop_advlock and vop_advlockasync.
Purge the locks on the vnode reclaim by using the lf_purgelocks().
The default implementation is augmented for the nfs and smbfs.
In the nfs_advlock, push the Giant inside the nfs_dolock.

Before the change, the vop_advlock and vop_advlockasync have taken the
unlocked vnode and dereferenced the fs-private inode data, racing with
with the vnode reclamation due to forced unmount. Now, the vop_getattr
under the shared vnode lock is used to obtain the inode size, and
later, in the lf_advlockasync, after locking the vnode interlock, the
VI_DOOMED flag is checked to prevent an operation on the doomed vnode.

The implementation of the lf_purgelocks() is submitted by dfr.

Reported by:	kris
Tested by:	kris, pho
Discussed with:	jeff, dfr
MFC after:	2 weeks
2008-04-16 11:33:32 +00:00
jhb
20cadd93f0 Fix a nit with the 'nofoo' options where 'foo' is mapped to 'nonofoo'
(such as 'atime' vs 'noatime').  The filesystems will always see either
'nofoo' or 'nonofoo', never plain 'foo'.  As such, their list of valid
mount options should include 'nofoo' instead of 'foo'.  With this fix,
you can do 'mount -u -o atime' on a FFS filesystem that isn't marked as
noatime without getting an error.  You can also update a noatime FFS
filesystem mounted via mount(2) (e.g. 6.x /sbin/mount binary) to 'atime'
using nmount(2) (e.g. 7.x /sbin/mount binary).

MFC after:	1 week
Reviewed by:	crodig
2008-03-26 20:48:07 +00:00
attilio
0d54671a48 Introduce some functions in the vnode locks namespace and in the ffs
namespace in order to handle lockmgr fields in a controlled way instead
than spreading all around bogus stubs:
- VN_LOCK_AREC() allows lock recursion for a specified vnode
- VN_LOCK_ASHARE() allows lock sharing for a specified vnode

In FFS land:
- BUF_AREC() allows lock recursion for a specified buffer lock
- BUF_NOREC() disallows recursion for a specified buffer lock

Side note: union_subr.c::unionfs_node_update() is the only other function
directly handling lockmgr fields. As this is not simple to fix, it has
been left behind as "sole" exception.
2008-02-24 16:38:58 +00:00
attilio
456bfb1f0f - Add real assertions to lockmgr locking primitives.
A couple of notes for this:
  * WITNESS support, when enabled, is only used for shared locks in order
    to avoid problems with the "disowned" locks
  * KA_HELD and KA_UNHELD only exists in the lockmgr namespace in order
    to assert for a generic thread (not curthread) owning or not the
    lock.  Really, this kind of check is bogus but it seems very
    widespread in the consumers code.  So, for the moment, we cater this
    untrusted behaviour, until the consumers are not fixed and the
    options could be removed (hopefully during 8.0-CURRENT lifecycle)
  * Implementing KA_HELD and KA_UNHELD (not surported natively by
    WITNESS) made necessary the introduction of LA_MASKASSERT which
    specifies the range for default lock assertion flags
  * About other aspects, lockmgr_assert() follows exactly what other
    locking primitives offer about this operation.

- Build real assertions for buffer cache locks on the top of
  lockmgr_assert().  They can be used with the BUF_ASSERT_*(bp)
  paradigm.

- Add checks at lock destruction time and use a cookie for verifying
  lock integrity at any operation.

- Redefine BUF_LOCKFREE() in order to not use a direct assert but
  let it rely on the aforementioned destruction time check.

KPI results evidently broken, so __FreeBSD_version bumping and
manpage update result necessary and will be committed soon.

Side note: lockmgr_assert() will be used soon in order to implement
real assertions in the vnode namespace replacing the legacy and still
bogus "VOP_ISLOCKED()" way.

Tested by:      kris (earlier version)
Reviewed by:    jhb
2008-02-13 20:44:19 +00:00
attilio
7213f4c32b Cleanup lockmgr interface and exported KPI:
- Remove the "thread" argument from the lockmgr() function as it is
  always curthread now
- Axe lockcount() function as it is no longer used
- Axe LOCKMGR_ASSERT() as it is bogus really and no currently used.
  Hopefully this will be soonly replaced by something suitable for it.
- Remove the prototype for dumplockinfo() as the function is no longer
  present

Addictionally:
- Introduce a KASSERT() in lockstatus() in order to let it accept only
  curthread or NULL as they should only be passed
- Do a little bit of style(9) cleanup on lockmgr.h

KPI results heavilly broken by this change, so manpages and
FreeBSD_version will be modified accordingly by further commits.

Tested by: matteo
2008-01-24 12:34:30 +00:00
attilio
caa2ca048b - Introduce the function lockmgr_recursed() which returns true if the
lockmgr lkp, when held in exclusive mode, is recursed
- Introduce the function BUF_RECURSED() which does the same for bufobj
  locks based on the top of lockmgr_recursed()
- Introduce the function BUF_ISLOCKED() which works like the counterpart
  VOP_ISLOCKED(9), showing the state of lockmgr linked with the bufobj

BUF_RECURSED() and BUF_ISLOCKED() entirely replace the usage of bogus
BUF_REFCNT() in a more explicative and SMP-compliant way.
This allows us to axe out BUF_REFCNT() and leaving the function
lockcount() totally unused in our stock kernel. Further commits will
axe lockcount() as well as part of lockmgr() cleanup.

KPI results, obviously, broken so further commits will update manpages
and freebsd version.

Tested by: kris (on UFS and NFS)
2008-01-19 17:36:23 +00:00
attilio
71b7824213 VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
attilio
18d0a0dd51 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
rodrigc
0591a64e3d Remove duplicate "union" from ext2_opts.
Noticed by:	bde
2007-10-27 16:14:33 +00:00
alfred
3a60df401c Get rid of qaddr_t.
Requested by: bde
2007-10-16 10:54:55 +00:00
cognet
0b8ac2d969 Some times ago, vfs_getopts() was changed, so that it would set error to
ENOENT if the option wasn't provided, instead of setting it to 0.
xfs however didn't catch up on this, so it assumed something went bad if
vfs_getopts() sets the error to non-zero, and just returns the error.
Unbreak xfs mount by just ignoring the error if vfs_getopts() sets the
error to ENOENT, as we should have sane defaults.

Reviewed by:    kan
Approved by:    re (rwatson)
Tested by:      rpaulo
2007-08-20 15:33:22 +00:00