Commit Graph

466 Commits

Author SHA1 Message Date
Wayne Salamon
65ee602e0c Audit the remaining parameters to the extattr system calls. Generate
the audit records for those calls.

Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-07-06 19:33:38 +00:00
Robert Watson
6e79e6f805 Audit command, uid arguments for quotactl().
Audit the mode argument to mkfifo().
Audit the target path passed to symlink().

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2006-06-05 13:34:23 +00:00
Jeff Roberson
3bbd6d8ae6 - Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs().
Discussed with:	tegge
Tested by:	kris
Sponsored by:	Isilon Systems, Inc.
2006-03-31 03:54:20 +00:00
John Baldwin
861dab08e7 Change vn_open() to honor the MPSAFE flag in the passed in nameidata object
and use that instead of testing fdidx against -1 to determine if it should
release Giant if Giant was locked due to the requested file residing on a
non-MPSAFE VFS.

Discussed with:	jeff
2006-03-28 21:22:08 +00:00
Jeff Roberson
77c79550af - Remove explicit calls to lock and unlock Giant and replace them with
VFS_LOCK_GIANT/VFS_UNLOCK_GIANT calls.  This completely removes Giant
   acquisition in the syscall path for ffs.

Bug fix to kern_fhstatfs from:	Todd Miller <Todd.Miller@sparta.com>
Sponsored by:	Isilon Systems, Inc.
2006-03-21 23:58:37 +00:00
Paul Saab
6308f39da8 use strlcpy in cvtstatfs and copy_statfs instead of bcopy to ensure
the copied strings are properly terminated.

bzero the statfs32 struct in copy_statfs.
2006-03-04 00:09:09 +00:00
Paul Saab
6815739e00 Don't truncate f_mntfromname & f_mntonname to 16 characters when
translating statfs into ostatfs.  This allows 4.x binaries making
statfs calls to work on 6.x.
2006-03-03 07:20:54 +00:00
Jeff Roberson
8febcfb92f - Use vfs_ref/rel to protect a mountpoint from going away while VFS_STATFS
is being called.  Be sure to grab the ref before we unlock the vnode to
   prevent the mount from disappearing.

Tested by:	kris
2006-02-23 05:18:07 +00:00
Wayne Salamon
bc5504b942 Add pathname and/or vnode argument auditing for the following system calls:
quotactl, statfs, fstatfs, fchdir, chdir, chroot, open, mknod, mkfifo,
link, symlink, undelete, unlink, access, eaccess, stat, lstat, pathconf,
readlink, chflags, lchflags, fchflags, chmod, lchmod, fchmod, chown,
lchown, fchown, utimes, lutimes, futimes, truncate, ftruncate, fsync,
rename, mkdir, rmdir, getdirentries, revoke, lgetfh, getfh, extattrctl,
extattr_set_file, extattr_set_link, extattr_get_file, extattr_get_link,
extattr_delete_file, extattr_delete_link, extattr_list_file, extattr_list_link.

In many cases the pathname and vnode auditing is done within namei lookup
instead of directly in the system call.

Audit the remaining arguments to these system calls:
fstatfs, fchdir, open, mknod, chflags, lchflags, fchflags, chmod, lchmod,
fchmod, chown, lchown, fchown, futimes, ftruncate, fsync, mkdir,
getdirentries.
2006-02-22 16:04:20 +00:00
Jeff Roberson
c5dcb84008 - Revert r1.406 until a solution can be found that doesn't break nfs. The
statfs handler in nfs will lock vnodes which may lead to deadlock or
   recursion.

Found by:	kris
Pointy hat to:	me
2006-02-22 09:52:25 +00:00
Jeff Roberson
05b6a20a66 - Hold the vnode used in the statfs related functions until we're done with
the VFS_STATFS call to prevent the mount from disappearing while we're
   stating.
 - Convert these routines to use MPSAFE namei semantics.

MFC After:	1 week
2006-02-22 06:19:08 +00:00
John Baldwin
809f984b21 Add a kern_eaccess() function and use it to implement xenix_eaccess()
rather than kern_access().

Suggested by:	rwatson
2006-02-06 22:00:53 +00:00
Jeff Roberson
2f0bca553a - Don't check v_mount for NULL to determine if a vnode has been recycled.
Use the more appropriate VI_DOOMED flag instead.

Sponsored by:	Isilon Systems, Inc.
MFC After:	1 week
2006-02-06 10:15:27 +00:00
Robert Watson
59428b0bad In fchdir(), Giant must be separately acquired and dropped if the old
vnode is from a file system that is not MPSAFE, as vrele() expects
Giant to be held when it is called on a non-MPSAFE vnode.

Spotted by:	kris
Tested by:	glebius
2006-02-03 15:42:16 +00:00
Jeff Roberson
0ac72424f0 - chroot and chdir need to lock giant as appropriate for the outgoing vp
as well as the new vp.

Sponsored by:	Isilon Systems, Inc.
MFC After:	3 days
2006-02-01 09:30:44 +00:00
Jeff Roberson
89b0e10910 - Reorder calls to vrele() after calls to vput() when the vrele is a
directory.  vrele() may lock the passed vnode, which in these cases would
   give an invalid lock order of child -> parent.  These situations are
   deadlock prone although do not typically deadlock because the vrele
   is typically not releasing the last reference to the vnode.  Users of
   vrele must consider it as a call to vn_lock() and order it appropriately.

MFC After: 	1 week
Sponsored by:	Isilon Systems, Inc.
Tested by:	kkenn
2006-02-01 00:25:26 +00:00
Don Lewis
1dd5fc0fde Tweak previous vfs_lookup.c commit to return an EINVAL error from
lookup() instead of EPERM when a DELETE or RENAME operation is
attempted on "..".

In kern_unlink(), remap EINVAL errors returned from namei() to EPERM
to match existing (and POSIX required) behaviour.

Discussed with:	bde
MFC after:	3 days
2006-01-22 19:37:02 +00:00
Diomidis Spinellis
c3d78136c9 Fix style bug.
Prompted by:	bde
2006-01-04 07:50:54 +00:00
Diomidis Spinellis
f8ccc6ceb9 Replace tv_usec normalization with the return of EINVAL.
This addresses two objections to the previous behavior,
and unbreaks the alpha tinderbox build.

TODO: update the utimes(2) man page.
2006-01-04 00:47:13 +00:00
Diomidis Spinellis
51339e8593 Normalize the tv_usec part of the utimes(2) arguments to ensure
that a file's atime and mtime are only set to correct fractional
second values (0-999999000ns with the current interface).
Prior to this change users could create files with values outside
that range.  Moreover, on 32-bit machines tv_usec offsets larger than
4.3s would result in an unnormalized AND wrong timestamp value,
due to overflow.

MFC after:	1 week
2006-01-03 21:58:21 +00:00
Pawel Jakub Dawidek
c505fe7a0f Reduce Giant scope a bit, as fdrop() is believed to be MPSAFE.
The purpose of this change is consistency (not performance improvement:)),
as it was hard to tell if fdrop() is MPSAFE or not when I saw it sometimes
under the Giant and sometimes without it.

Glanced at by:	ssouhlal, kan
2005-12-20 00:49:59 +00:00
Christian S.J. Peron
c47a4d1c9f Implement new world order in VFS locking for extended attributes. This will
remove the unconditional acquisition of Giant for extended attribute related
operations. If the file system is set as being MP safe and debug.mpsafevfs is
1, do not pickup Giant.

Mark the following system calls as being MP safe so we no longer pickup Giant
in the system call handler:

o extattrctl
o extattr_set_file
o extattr_get_file
o extattr_delete_file
o extattr_set_fd
o extattr_get_fd
o extattr_delete_fd
o extattr_set_link
o extattr_get_link
o extattr_delete_link
o extattr_list_file
o extattr_list_link
o extattr_list_fd

-Pass MPSAFE flags to namei(9) lookup and introduce vfslocked variable which
 will keep track of any Giant acquisitions.
-Wrap any fd operations which manipulate vnodes in VFS_{UN}LOCK_GIANT
-Drop VFS_ASSERT_GIANT into function which operate on vnodes to ensure that
 we are sufficiently protected.

I've tested these changes with various TrustedBSD MAC policies which use
extended attribute a lot on SMP and UP systems (thanks to Scott Long for
making some SMP hardware available to me for testing).

Discussed with:	jeff
Requested by:	jhb, rwatson
2005-09-24 23:47:04 +00:00
Christian S.J. Peron
68ff2a4397 Improve the MP safeness associated with the creation of symbolic
links and the execution of ELF binaries. Two problems were found:

1) The link path wasn't tagged as being MP safe and thus was not properly
   protected.
2) The ELF interpreter vnode wasnt being locked in namei(9) and thus was
   insufficiently protected.

This commit makes the following changes:

-Sets the MPSAFE flag in NDINIT for symbolic link paths
-Sets the MPSAFE flag in NDINIT and introduce a vfslocked variable which
 will be used to instruct VFS_UNLOCK_GIANT to unlock Giant if it has been
 picked up.
-Drop in an assertion into vfs_lookup which ensures that if the MPSAFE
 flag is NOT set, that we have picked up giant. If not panic (if WITNESS
 compiled into the kernel). This should help us find conditions where vnode
 operations are in-sufficiently protected.

This is a RELENG_6 candidate.

Discussed with:	jeff
MFC after:	4 days
2005-09-15 15:03:48 +00:00
Pawel Jakub Dawidek
d8b464e51e In case of mac_check_vnode_rename_from() or vn_start_write() failure,
vn_finished_write() should not be called.

Reviewed by:	ssouhlal
MFC after:	3 days
2005-09-01 21:46:33 +00:00
Pawel Jakub Dawidek
06a137780b Actually only protect mount-point if security.jail.enforce_statfs is set to 2.
If we don't return statistics about requested file systems, system tools
may not work correctly or at all.

Approved by:	re (scottl)
2005-06-23 22:13:29 +00:00
Jeff Roberson
dbb3ec5ce3 - Remove vnode lock asserts at the end of vfs syscalls. These asserts were
used to ensure that we weren't exiting the syscall with a lock still
   held.  This wasn't safe, however, because we'd already executed a vput()
   and on a loaded system the vnode may have been free'd by the time we
   assert.  This functionality is also handled by the td_locks assert in
   userret, which doesn't tell you what the syscall was, but will at least
   panic before you deadlock.

Sponsored by:   Isilon Systems, Inc.
Discovred by:   Peter Holm
Approved by:	re (blanket vfs)
2005-06-14 01:14:40 +00:00
Pawel Jakub Dawidek
65ac438c8f Do not allocate memory while holding a mutex.
I introduce a very small race here (some file system can be mounted or
unmounted between 'count' calculation and file systems list creation),
but it is harmless.

Found by:	FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/
Reported by:	Peter Holm <peter@holm.cc>
2005-06-12 07:03:23 +00:00
Pawel Jakub Dawidek
3a996d6e91 Do not allocate memory based on not-checked argument from userland.
It can be used to panic the kernel by giving too big value.
Fix it by moving allocation and size verification into kern_getfsstat().
This even simplifies kern_getfsstat() consumers, but destroys symmetry -
memory is allocated inside kern_getfsstat(), but has to be freed by the
caller.

Found by:	FreeBSD Kernel Stress Test Suite: http://www.holm.cc/stress/
Reported by:	Peter Holm <peter@holm.cc>
2005-06-11 14:58:20 +00:00
Pawel Jakub Dawidek
820a0de9a9 Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs
and extend its functionality:

value	policy
0	show all mount-points without any restrictions
1	show only mount-points below jail's chroot and show only part of the
	mount-point's path (if jail's chroot directory is /jails/foo and
	mount-point is /jails/foo/usr/home only /usr/home will be shown)
2	show only mount-point where jail's chroot directory is placed.

Default value is 2.

Discussed with:	rwatson
2005-06-09 18:49:19 +00:00
Pawel Jakub Dawidek
13a82b9623 Avoid code duplication in serval places by introducing universal
kern_getfsstat() function.

Obtained from:	jhb
2005-06-09 17:44:46 +00:00
Craig Rodrigues
1209e08faf Initialize uio_iovcnt to 1 in extattr_list_vp() and extattr_get_vp()
PR:		kern/79357
Approved by:	rwatson
2005-06-08 13:22:10 +00:00
Robert Watson
f8e5f64207 Acquire Giant explicitly in quotactl() so that the syscalls.master
entry can become MSTD.
2005-05-28 13:11:35 +00:00
Robert Watson
f73e1f57cf Acquire Giant explicitly in fhopen(), fhstat(), and kern_fhstatfs(),
so that we can start to eliminate the presence of non-MPSAFE system
call entries in syscalls.master.
2005-05-28 12:58:54 +00:00
Pawel Jakub Dawidek
9e9cfdca7a Remove (now) unused argument 'td' from cvtstatfs(). 2005-05-27 19:23:48 +00:00
Pawel Jakub Dawidek
2b19a055ab Sync locking in freebsd4_getfsstat() with getfsstat().
Giant is probably also needed in kern_fhstatfs().
2005-05-27 19:21:08 +00:00
Pawel Jakub Dawidek
fd91686850 Use consistent style in functions I want to modify in the near future. 2005-05-27 19:15:46 +00:00
Pawel Jakub Dawidek
c95cbf7ec7 Protect fsid in freebsd4_getfsstat() in simlar way as it is done in
getfsstat().
2005-05-22 23:05:27 +00:00
Pawel Jakub Dawidek
a0e96a49df If we need to hide fsid, kern_statfs()/kern_fstatfs() will do it for us,
so do not duplicate the code in cvtstatfs().
Note, that we now need to clear fsid in freebsd4_getfsstat().

This moves all security related checks from functions like cvtstatfs()
and will allow to add more security related stuff (like statfs(2), etc.
protection for jails) a bit easier.
2005-05-22 21:52:30 +00:00
Jeff Roberson
836c5b4149 - vput(tvp) before vrele(tdvp) in kern_rename() to avoid lock order issues. 2005-04-11 09:19:08 +00:00
Jeff Roberson
5ef9827cea - Remove the namei NOOBJ flag. It is meaningless now.
Sponsored by:	Isilon Systems, Inc.
2005-04-09 12:04:36 +00:00
Jeff Roberson
d830f82824 - Pass LK_EXCLUSIVE to VFS_ROOT() to satisfy the new flags argument. For
now, all calls to VFS_ROOT() should still acquire exclusive locks.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 07:31:38 +00:00
Jeff Roberson
ae88db8a72 - Remove the #ifdef LOOKUP_SHARED from some calls to NDINIT. The
LOCKSHARED flag is simply ignored in namei() if LOOKUP_SHARED is not
   enabled.

Sponsored by:	Isilon Systems, Inc.
2005-03-24 06:03:31 +00:00
Jeff Roberson
9331fd135b - Don't VOP_UNLOCK prior to VOP_REVOKE. The lock is required now.
Sponsored by:	Isilon Systems, Inc.
2005-03-13 11:45:51 +00:00
Poul-Henning Kamp
88e5b12a20 Drag another softupdates tentacle back into FFS: Now that FFS's
vop_fsync is separate from the internal use we can do the full job
there.
2005-02-08 18:09:11 +00:00
John Baldwin
fee4a6af3a Implement a kern_pathconf() wrapper for pathconf() which can take the
filename from either a user space or a kernel space pointer.
2005-02-07 21:46:43 +00:00
John Baldwin
76951d21d1 - Tweak kern_msgctl() to return a copy of the requested message queue id
structure in the struct pointed to by the 3rd argument for IPC_STAT and
  get rid of the 4th argument.  The old way returned a pointer into the
  kernel array that the calling function would then access afterwards
  without holding the appropriate locks and doing non-lock-safe things like
  copyout() with the data anyways.  This change removes that unsafeness and
  resulting race conditions as well as simplifying the interface.
- Implement kern_foo wrappers for stat(), lstat(), fstat(), statfs(),
  fstatfs(), and fhstatfs().  Use these wrappers to cut out a lot of
  code duplication for freebsd4 and netbsd compatability system calls.
- Add a new lookup function kern_alternate_path() that looks up a filename
  under an alternate prefix and determines which filename should be used.
  This is basically a more general version of linux_emul_convpath() that
  can be shared by all the ABIs thus allowing for further reduction of
  code duplication.
2005-02-07 18:44:55 +00:00
Jeff Roberson
ff05fd5d77 - Correct a typo in kern_rename. tvfslocked should be initialized from
tond and not fromnd.  This could lead us to leak Giant, or unlock it
   twice, depending on the filesystems involved.  renames within a single
   filesystem would not have caused any problems.

Sponsored by:	Isilon Systems, Inc.
2005-02-02 17:17:15 +00:00
Jeff Roberson
37c15216fc - Or MPSAFE with the correct set of flags in stat(). This affected only
the LOOKUP_SHARED case.

Spotted by:	jhb
2005-02-01 23:43:46 +00:00
Poul-Henning Kamp
8516dd18e1 Don't use VOP_GETVOBJECT, use vp->v_object directly. 2005-01-25 00:40:01 +00:00
Poul-Henning Kamp
dcff5b1440 Don't call VOP_CREATEVOBJECT(), it's the responsibility of the
filesystem which owns the vnode.
2005-01-24 23:53:54 +00:00
Jeff Roberson
94a9458501 - Change all vfs syscalls to use VFS_LOCK_GIANT(), and MPSAFE nds.
- Move Giant acquisition into the few vfs syscalls that weren't already
   directly acquiring it.

Sponsored By:	Isilon Systems, Inc.
2005-01-24 10:25:44 +00:00
Poul-Henning Kamp
e39db32ab0 Ditch vfs_object_create() and make the callers call VOP_CREATEVOBJECT()
directly.
2005-01-13 12:25:19 +00:00
Poul-Henning Kamp
8df6bac4c7 Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC().
I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:

	The credentials for syncing a file (ability to write to the
	file) should be checked at the system call level.

	Credentials for syncing one or more filesystems ("none")
	should be checked at the system call level as well.

	If the filesystem implementation needs a particular credential
	to carry out the syncing it would logically have to the
	cached mount credential, or a credential cached along with
	any delayed write data.

Discussed with:	rwatson
2005-01-11 07:36:22 +00:00
Warner Losh
9454b2d864 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 23:35:40 +00:00
Poul-Henning Kamp
9bb4281603 Eliminate pointless goto. 2004-11-16 08:22:06 +00:00
Poul-Henning Kamp
48ab5b2d21 Forgot to remove now unused variable in last commit. 2004-11-15 21:28:00 +00:00
Poul-Henning Kamp
136211e58e It is not necessary to hold vn_start_write/vn_finished_write around VOP_REVOKE. 2004-11-15 21:27:06 +00:00
Poul-Henning Kamp
718fe8e2bf Next FILEDESC_LOCK properly around FILE_LOCK 2004-11-15 21:26:13 +00:00
Poul-Henning Kamp
124e4c3be8 Introduce an alias for FILEDESC_{UN}LOCK() with the suffix _FAST.
Use this in all the places where sleeping with the lock held is not
an issue.

The distinction will become significant once we finalize the exact
lock-type to use for this kind of case.
2004-11-13 11:53:02 +00:00
Poul-Henning Kamp
ef11fbd7c4 Introduce fdclose() which will clean an entry in a filedesc.
Replace homerolled versions with call to fdclose().

Make fdunused() static to kern_descrip.c
2004-11-07 22:16:07 +00:00
Colin Percival
56f21b9d74 Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is
somewhat clearer, but more importantly allows for a consistent naming
scheme for suser_cred flags.

The old name is still defined, but will be removed in a few days (unless I
hear any complaints...)

Discussed with:	rwatson, scottl
Requested by:	jhb
2004-07-26 07:24:04 +00:00
Alfred Perlstein
f257b7a54b Make VFS_ROOT() and vflush() take a thread argument.
This is to allow filesystems to decide based on the passed thread
which vnode to return.
Several filesystems used curthread, they now use the passed thread.
2004-07-12 08:14:09 +00:00
Robert Watson
4f3bf9b9b4 Don't cuddle else's so much as we removed additional parts of each
block.
2004-06-24 17:22:29 +00:00
Robert Watson
5e11031e05 Remove temporary API bandage that allowed applications speaking the
older API to list attributes on a file (zero-length attribute name)
to function.  extattr_list_*() are now the only available APIs to
use when listing attributes.
2004-06-24 17:14:28 +00:00
Robert Watson
9260798fd7 Acquire Giant in link() so that the system call can be marked
MPSAFE.  Don't want to acquire Giant in kern_link() sync linux
compat code performs actions requiring Giant prior to calling
kern_link().
2004-06-22 04:34:05 +00:00
Robert Watson
694b21cf7b Acquire Giant in link() so that we can mark it as MSTD in
syscalls.master.  Don't want to do it in kern_link() since the
Linux emulation code calls kern_link() after performing other
actions requiring Giant.
2004-06-22 04:29:07 +00:00
Poul-Henning Kamp
d7086f313a Only initialize f_data and f_ops if nobody else did so already. 2004-06-19 11:41:45 +00:00
Poul-Henning Kamp
1930e303cf Deorbit COMPAT_SUNOS.
We inherited this from the sparc32 port of BSD4.4-Lite1.  We have neither
a sparc32 port nor a SunOS4.x compatibility desire these days.
2004-06-11 11:16:26 +00:00
Pawel Jakub Dawidek
79db0f1cbf Remove unused code.
Submitted by:	Bjoern A. Zeeb
2004-06-07 12:19:55 +00:00
Tim J. Robbins
c4d85674d5 Remove a stale comment. 2004-06-04 11:00:22 +00:00
Tim J. Robbins
f52e2ef29f Eliminate a memory leak in kern_symlink() that could occur if
vn_start_write() failed.
2004-05-11 10:42:02 +00:00
Pawel Jakub Dawidek
6c0ad4a77a Always use nd.ni_vp->v_mount as an argument for VFS_QUOTACTL(), just like
in RELENG_4.

Pointed out by:	Alex Lyashkov <umka@sevinter.net>
2004-04-26 15:44:42 +00:00
Pawel Jakub Dawidek
0c0c597faa Look out! vn_start_write() is able to return 0 and NULL 'mp'.
Submitted by:	Alex Lyashkov <shadow@psoft.net>
2004-04-22 15:40:27 +00:00
Bruce Evans
295ed75297 Removed some less than useful comments:
- don't say what a small subset of the options includes are for.
- don't mark up functions which use all their args with /* ARGSUSED */.
  The markup should have been removed when the unused retval parameter
  was removed.
- don't comment on what routine suser() checks do.  Removed nearby
  excessive vertical whitespace.
2004-04-06 10:05:02 +00:00
Warner Losh
7f8a436ff2 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-05 21:03:37 +00:00
Doug Rabson
0b0a60fb43 Add lgetfh(2) which is like getfh(2) but doesn't follow symlinks. 2004-04-05 10:15:53 +00:00
David Malone
31c7e8b05b Nudge Giant as far as I can into kern_open(). Mark open() as MPSAFE.
Use kern_open() to implement creat() rather than taking the long route
through open(). Mark creat as MPSAFE.

While I'm at it, mark nosys() (syscall 0) as MPSAFE, for all the
difference it will make.
2004-03-16 10:46:42 +00:00
Pawel Jakub Dawidek
dd604e2647 Add two new sysctls:
- security.bsd.hardlink_check_uid, when set, means, that unprivileged
		users are not permitted to create hard links to files not
		owned by them,
	- security.bsd.hardlink_check_gid, when set, means, that unprivileged
		users are not permitted to create hard links to files owned
		by group they don't belong to.

OK'ed by:	rwatson
2004-03-08 20:37:25 +00:00
David Malone
a1cc6206fb Correct a comment.
Reviewed by:	alfred, tanimura
2004-02-17 12:30:32 +00:00
Robert Watson
f08df373a3 By default, when a process in jail calls getfsstat(), only return the
data for the file system on which the jail's root vnode is located.
Previous behavior (show data for all mountpoints) can be restored
by setting security.jail.getfsstatroot_only to 0.  Note: this also
has the effect of hiding other mounts inside a jail, such as /dev,
/tmp, and /proc, but errs on the side of leaking less information.
2004-02-14 18:31:11 +00:00
Dag-Erling Smørgrav
a2fe44e8cf New file descriptor allocation code, derived from similar code introduced
in OpenBSD by Niels Provos.  The patch introduces a bitmap of allocated
file descriptors which is used to locate available descriptors when a new
one is needed.  It also moves the task of growing the file descriptor table
out of fdalloc(), reducing complexity in both fdalloc() and do_dup().

Debts of gratitude are owed to tjr@ (who provided the original patch on
which this work is based), grog@ (for the gdb(4) man page) and rwatson@
(for assistance with pxeboot(8)).
2004-01-15 10:15:04 +00:00
Dag-Erling Smørgrav
05c3c5c8b6 Mechanical whitespace cleanup; parenthesize return values; other minor
style nits.  The #ifdefs in this file give me a headache...
2004-01-11 19:52:10 +00:00
Robert Watson
69546b2fbb Document that when we are addressing an open()/close() race, the reason
we call vn_close() manually rather than letting fdrop() take care of it
is that we haven't yet hooked up the various 'struct file' fields.
2003-12-24 17:13:01 +00:00
Kirk McKusick
fde81c7d8e Update the statfs structure with 64-bit fields to allow
accurate reporting of multi-terabyte filesystem sizes.

You should build and boot a new kernel BEFORE doing a `make world'
as the new kernel will know about binaries using the old statfs
structure, but an old kernel will not know about the new system
calls that support the new statfs structure. Running an old kernel
after a `make world' will cause programs such as `df' that do a
statfs system call to fail with a bad system call.

Reviewed by:	Bruce Evans <bde@zeta.org.au>
Reviewed by:	Tim Robbins <tjr@freebsd.org>
Reviewed by:	Julian Elischer <julian@elischer.org>
Reviewed by:	the hoards of <arch@freebsd.org>
Sponsored by:   DARPA & NAI Labs.
2003-11-12 08:01:40 +00:00
David Malone
e1419c08e2 falloc allocates a file structure and adds it to the file descriptor
table, acquiring the necessary locks as it works. It usually returns
two references to the new descriptor: one in the descriptor table
and one via a pointer argument.

As falloc releases the FILEDESC lock before returning, there is a
potential for a process to close the reference in the file descriptor
table before falloc's caller gets to use the file. I don't think this
can happen in practice at the moment, because Giant indirectly protects
closes.

To stop the file being completly closed in this situation, this change
makes falloc set the refcount to two when both references are returned.
This makes life easier for several of falloc's callers, because the
first thing they previously did was grab an extra reference on the
file.

Reviewed by:	iedowse
Idea run past:	jhb
2003-10-19 20:41:07 +00:00
Robert Watson
c096756c00 Add mac_check_vnode_deleteextattr() and mac_check_vnode_listextattr():
explicit access control checks to delete and list extended attributes
on a vnode, rather than implicitly combining with the setextattr and
getextattr checks.  This reflects EA API changes in the kernel made
recently, including the move to explicit VOP's for both of these
operations.

Obtained from:	TrustedBSD PRoject
Sponsored by:	DARPA, Network Associates Laboratories
2003-08-21 13:53:01 +00:00
John Baldwin
b2db7dc658 td_dupfd just needs to be less than 0, it does not have to hold the
negative value of the index of the new file, so just use -1.
2003-08-07 17:08:26 +00:00
Ian Dowse
76bd23557b In the mknod(), mkfifo(), link(), symlink() and undelete() syscalls,
use vrele() instead of vput() on the parent directory vnode returned
by namei() in the case where it is equal to the target vnode. This
handles namei()'s somewhat strange (but documented) behaviour of
not locking either vnode when the two vnodes are equal and LOCKPARENT
but not LOCKLEAF is specified.

Note that since a vnode double-unlock is not currently fatal, these
coding errors were effectively harmless.

Spotted by:	Juergen Hannken-Illjes <hannken@eis.cs.tu-bs.de>
Reviewed by:	mckusick
2003-08-05 00:26:51 +00:00
Robert Watson
9080ff25cf Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the
kernel ACL interfaces and system call names.

Break out UFS2 and FFS extattr delete and list vnode operations from
setextattr and getextattr to deleteextattr and listextattr, which
cleans up the implementations, and makes the results more readable,
and makes the APIs more clear.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-07-28 18:53:29 +00:00
Poul-Henning Kamp
cf7742997a Pass the file descriptor index down to vn_open.
If the method vector was replaced and we got the "special return code"
smile and trust that whatever happened below DTRT.
2003-07-27 20:09:13 +00:00
Poul-Henning Kamp
7c89f162bc Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout. 2003-07-27 17:04:56 +00:00
Poul-Henning Kamp
a8d43c90af Add a "int fd" argument to VOP_OPEN() which in the future will
contain the filedescriptor number on opens from userland.

The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*

For now pass -1 all over the place.
2003-07-26 07:32:23 +00:00
Poul-Henning Kamp
1226914c17 Use the f_vnode field to tell which file descriptors have a vnode. 2003-07-04 12:20:27 +00:00
Robert Watson
6b42f0a2eb Prefer the vop_rmextattr() vnode operation for removing extended
attributes from objects over vop_setextattr() with a NULL uio; if
the file system doesn't support the vop_rmextattr() method, fall
back to the vop_setextattr() method.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-22 23:03:07 +00:00
Poul-Henning Kamp
3b6d965263 Add a f_vnode field to struct file.
Several of the subtypes have an associated vnode which is used for
stuff like the f*() functions.

By giving the vnode a speparate field, a number of checks for the specific
subtype can be replaced simply with a check for f_vnode != NULL, and
we can later free f_data up to subtype specific use.

At this point in time, f_data still points to the vnode, so any code I
might have overlooked will still work.
2003-06-22 08:41:43 +00:00
Poul-Henning Kamp
eaaca5deee Don't (re)initialize f_gcflag to zero.
Move initialization of DTYPE_VNODE specific field f_seqcount into
the DTYPE_VNODE specific code.
2003-06-20 08:02:30 +00:00
Don Lewis
6084b6c9d5 FILE_LOCK() uses a pool mutex, as does the vnode v_vnlock. Since pool
mutexes are supposed to only be used as leaf mutexes, and what appear
to be separate pool mutexes could be aliased together, it is bad idea
for a thread to attempt to hold two pool mutexes at the same time.

Slightly rearrange the code in kern_open() so that FILE_UNLOCK() is
called before calling VOP_GETVOBJECT(), which will grab the v_vnlock
mutex.
2003-06-19 04:10:56 +00:00
Poul-Henning Kamp
2db4b023bb Introduce a new flag on a file descriptor: DFLAG_SEEKABLE and use that
rather than assume that only DTYPE_VNODE is seekable.
2003-06-18 19:53:59 +00:00
David E. O'Brien
677b542ea2 Use __FBSDID(). 2003-06-11 00:56:59 +00:00
Robert Watson
777621799b If a system call comes in requesting to retrieve an attribute named
"", temporarily map it to a call to extattr_list_vp() to provide
compatibility for older applications using the "" API to retrieve
EA lists.

Use VOP_LISTEXTATTR() to support extattr_list_vp() rather than
VOP_GETEXTATTR(..., "", ...).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Asssociates Laboratories
2003-06-05 05:55:34 +00:00