wait until the current suspension is lifted instead of silently returning
success immediately. The consequences of calling vfs_write() resume when
not owning the suspension are not well-defined at best.
Add the vfs_susp_clean() mount method to be called from
vfs_write_resume(). Set it to process_deferred_inactive() for ffs, and
stop calling it manually.
Add the thread flag TDP_IGNSUSP that allows to bypass the suspension
point in the vn_start_write. It is intended for use by VFS in the
situations where the suspender want to do some i/o requiring calls to
vn_start_write(), and this i/o cannot be done later.
Reviewed by: tegge
In collaboration with: pho
MFC after: 1 month
lstat(2) is called on symlinks -- this code appears never to have
worked. The PR this addresses suggests that the intended
original behavior is the right one, but as bde points out in the
PR comments, we do actually support storing a mode on symlinks,
so returning it seems reasonable.
This is consistent with Mac OS X, which despite documentation to
the contrary does return the mode set on a symlink, but not some
other platforms. The Single Unix Spec requires only that the
returned bits be "meaningful", which seems at best unhelpful as
advice goes.
PR: 25018
MFC after: 3 days
which simply want a reference should use vref(). Callers which want
to check validity need to hold a lock while performing any action
based on that validity. vn_lock() would always release the interlock
before returning making any action synchronous with the validity check
impossible.
is requested. Handle this case specially before the while loop.
- Use the held vnode lock to check for VI_DOOMED. The vnode lock and
interlock must both be held to set VI_DOOMED so either one held, even
shared, is sufficient to check it.
No objection by: kib
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.
KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.
Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.
Manpage and FreeBSD_version will be updated through further commits.
As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.
Tested by: Diego Sardina <siarodx at gmail dot com>,
Andrea Di Pasquale <whyx dot it at gmail dot com>
This makes it possible to support ftruncate() on non-vnode file types in
the future.
- 'struct fileops' grows a 'fo_truncate' method to handle an ftruncate() on
a given file descriptor.
- ftruncate() moves to kern/sys_generic.c and now just fetches a file
object and invokes fo_truncate().
- The vnode-specific portions of ftruncate() move to vn_truncate() in
vfs_vnops.c which implements fo_truncate() for vnode file types.
- Non-vnode file types return EINVAL in their fo_truncate() method.
Submitted by: rwatson
- spell 16384 as 16384 and not as BKVASIZE. 16384 is (not quite) just a
magic size that works well in practice. BKVASIZE should be MAXBSIZE
(65536), but is 16384 because i386's don't have enough kva for it to
be MAXBSIZE; 16384 works (not so well) for it for much the same reasons
that it works well in the heuristic.
- expand and/or add comments about this and other details.
- don't explicitly inline this function.
- fix some other style bugs.
- Introduce a finit() which is used to initailize the fields of struct file
in such a way that the ops vector is only valid after the data, type,
and flags are valid.
- Protect f_flag and f_count with atomic operations.
- Remove the global list of all files and associated accounting.
- Rewrite the unp garbage collection such that it no longer requires
the global list of all files and instead uses a list of all unp sockets.
- Mark sockets in the accept queue so we don't incorrectly gc them.
Tested by: kris, pho
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:
mac_<object>_<method/action>
mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly
reversed from the new scheme. Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier. Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods. Also simplify, slightly,
some entry point names.
All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.
Proposed and reviewed by: jhb
Reviewed by: daichi (unionfs)
Approved by: re (kensmith)
function calls are no more generated for vop_lock.
Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption
about vop naming conventions. This restores pre/post-condition calls.
to become negative. This will detect the underflow when it
happens, instead of having it discovered when the vnode is
taken off the freelist, long after the offending process is long
gone.
specific privilege names to a broad range of privileges. These may
require some future tweaking.
Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>
begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.
This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA
vn_finished_write() should also be called only then.
BTW. I fixed two functions here: vn_rdwr() and vn_write(). The latter seems
to be unused.
MFC after: 3 weeks
and use that instead of testing fdidx against -1 to determine if it should
release Giant if Giant was locked due to the requested file residing on a
non-MPSAFE VFS.
Discussed with: jeff
Fix detection of active unlinked files by checking VI_OWEINACT and
VI_DOINGINACT in addition to v_usecount.
Defer inactive handling for unlinked files if the file system is mostly
suspended (secondary writes being blocked).
Perform deferred inactive handling after the file system is resumed.
replacement for vn_write_suspend_wait() to better account for secondary write
processing.
Close race where secondary writes could be started after ffs_sync() returned
but before the file system was marked as suspended.
Detect if secondary writes or softdep processing occurred during vnode sync
loop in ffs_sync() and retry the loop if needed.
caller by saving the stack of the last locker/unlocker in lockmgr. We
also put the stack in KTR at the moment.
Contributed by: Antoine Brodin <antoine.brodin@laposte.net>
used to ensure that we weren't exiting the syscall with a lock still
held. This wasn't safe, however, because we'd already executed a vput()
and on a loaded system the vnode may have been free'd by the time we
assert. This functionality is also handled by the td_locks assert in
userret, which doesn't tell you what the syscall was, but will at least
panic before you deadlock.
Sponsored by: Isilon Systems, Inc.
Discovred by: Peter Holm
Approved by: re (blanket vfs)
as I'd like to get rid of the vxthread.
- Handle lock requests which don't actually want a lock as this is a
much more convenient place to handle this condition than in vget().
These requests simply want to know that VI_DOOMED isn't set.
- Correct a test at the end of vn_lock, if error !=0 should be
if error == 0, this has been broken since I comitted the VI_DOOMED
changes, but no one ran into it because vget() duplicated this
functionality.
Sponsored by: Isilon Systems, Inc.
on an unlinked file. We can't know if this is the case until after we
have the lock.
- Lock the vnode in vn_close, many filesystems had code which was unsafe
without the lock held, and holding it greatly simplifies vgone().
- Adjust vn_lock() to check for the VI_DOOMED flag where appropriate.
Sponsored by: Isilon Systems, Inc.
vn_extattr_rm. This is meant to catch conditions where IO_NODELOCKED
has been specified without the vnode being locked.
Discussed with: rwatson
MFC after: 1 week
- Protect access to mnt_kern_flag with the mountpoint mutex.
- Use the appropriate nd flags to deal with giant in vn_open_cred().
We currently determine whether the caller is mpsafe by checking
for a valid fdidx. Any caller coming from user-space is now
mpsafe and supplies a valid fd. No kenrel callers have been
converted to mpsafe, so this check is sufficient for now.
- Use VFS_LOCK_GIANT instead of manual giant acquisition where
appropriate.
Sponsored By: Isilon Systems, Inc.
I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:
The credentials for syncing a file (ability to write to the
file) should be checked at the system call level.
Credentials for syncing one or more filesystems ("none")
should be checked at the system call level as well.
If the filesystem implementation needs a particular credential
to carry out the syncing it would logically have to the
cached mount credential, or a credential cached along with
any delayed write data.
Discussed with: rwatson
Don't grab Giant in the upper syscall/wrapper code
NET_LOCK_GIANT in the socket code (sockets/fifos).
mtx_lock(&Giant) in the vnode code.
mtx_lock(&Giant) in the opencrypto code. (This may actually not be
needed, but better safe than sorry).
Devfs grabs Giant if the driver is marked as needing Giant.