Commit Graph

1098 Commits

Author SHA1 Message Date
Jeff Roberson
cfd5600c66 - Several of the callers to getdirtybuf() were erroneously changed to pass
in a list head instead of a pointer to the first element at the time of
   the first call.  These lists are subject to change, and getdirtybuf()
   would refetch from the wrong list in some cases.

Spottedy by:	tegge
Pointy hat to:	me
2003-09-03 04:08:15 +00:00
Jeff Roberson
23efe6dafc - Backout rev 1.142. This caused a deadlock that I do not understand. More
investigation is required.
2003-08-31 11:26:52 +00:00
Jeff Roberson
d919a11d06 - Define a new flag for getblk(): GB_NOCREAT. This flag causes getblk() to
bail out if the buffer is not already present.
 - The buffer returned by incore() is not locked and should not be sent to
   brelse().  Use getblk() with the new GB_NOCREAT flag to preserve the
   desired semantics.
2003-08-31 08:50:11 +00:00
Jeff Roberson
a0ebaaddef - Don't acquire the vnode interlock in drain_output(). Instead, require the
caller to acquire it.  This permits drain_output() to be done atomically
   with other operations as well as reducing the number of lock operations.
 - Assert that the proper locks are held in drain_output().
 - Change getdirtybuf() to accept a mutex as an argument.  This mutex is used
   to protect the vnode's buf list and the BKGRDWAIT flag.  This lock is
   dropped when we successfully acquire a buffer and held on return
   otherwise.  These semantics reduce the number of cumbersome cases in
   calling code.
 - Pass the mtx from getdirtybuf() into interlocked_sleep() and allow this
   mutex to be used as the interlock argument to BUF_LOCK() in the LOCKBUF
   case of interlocked_sleep().
 - Change the return value of getdirtybuf() to be the resulting locked buffer
   or NULL otherwise.  This is for callers who pass in a list head that
   requires a lock.  It is necessary since the lock that protects the list
   head must be dropped in getdirtybuf() so that we don't have a lock order
   reversal with the buf queues lock in bremfree().
 - Adjust all callers of getdirtybuf() to match the new semantics.
 - Add a comment in indir_trunc() that points at unlocked access to a buf.
   This may also be one of the last instances of incore() in the tree.
2003-08-31 07:29:34 +00:00
Jeff Roberson
9dbfeb0ae6 - Move BX_BKGRDWAIT and BX_BKGRDINPROG to BV_ and the b_vflags field.
- Surround all accesses of the BKGRD{WAIT,INPROG} flags with the vnode
   interlock.
 - Don't use the B_LOCKED flag and QUEUE_LOCKED for background write
   buffers.  Check for the BKGRDINPROG flag before recycling or throwing
   away a buffer.  We do this instead because it is not safe for us to move
   the original buffer to a new queue from the callback on the background
   write buffer.
 - Remove the B_LOCKED flag and the locked buffer queue.  They are no longer
   used.
 - The vnode interlock is used around checks for BKGRDINPROG where it may
   not be strictly necessary.  If we hold the buf lock the a back-ground
   write will not be started without our knowledge, one may only be
   completed while we're not looking.  Rather than remove the code, Document
   two of the places where this extra locking is done.  A pass should be
   done to verify and minimize the locking later.
2003-08-28 06:55:18 +00:00
Alan Cox
9cf8f2f707 The previous change necessitates the addition of a new #include. Otherwise,
there is a compilation warning.
2003-08-18 17:27:08 +00:00
Poul-Henning Kamp
b103854847 Don't use a VOP_*() function on our own vnodes, go directly to the
relevant internal function, in this case ufs_bmaparray().
2003-08-17 19:26:03 +00:00
Alan Cox
f6c098e569 Revision 1.44 of ufs/ufs/inode.h has made it necessary to add two new
#includes to this file.  Otherwise, it doesn't compile.
2003-08-16 06:15:17 +00:00
Poul-Henning Kamp
5c24d6ee26 Eliminate the i_devvp field from the incore UFS inodes, we can
get the same value from ip->i_ump->um_devvp.

This saves a pointer in the memory copies of inodes, which can
easily run into several hundred kilobytes.

The extra indirection is unmeasurable in benchmarks.

Approved by:	mckusick
2003-08-15 20:03:19 +00:00
John Baldwin
8b149b5131 Consistently use the BSD u_int and u_short instead of the SYSV uint and
ushort.  In most of these files, there was a mixture of both styles and
this change just makes them self-consistent.

Requested by:	bde (kern_ktrace.c)
2003-08-07 15:04:27 +00:00
Robert Watson
2495048579 Now that the central POSIX.1e ACL code implements functions to
generate the inode mode from a default ACL and creation mask,
implement ufs_sync_inode_from_acl() using acl_posix1e_newfilemode().

Since ACL_OVERRIDE_MASK/ACL_PRESERVE_MASK are defined, we no
longer need to explicitly pass in a "preserve_mask" field: this
is implicit in the use of POSIX.1e semantics.

Note: this change contains a semantic bugfix for new file creation:
we now intersect the ACL-generated mode and the cmode requested by
the user process.  This means permissions on newly created file
objects will now be more conservative.  In the future, we may want
to provide alternative semantics (similar to Solaris and Linux) in
which the ACL mask overrides the umask, permitting ACLs to broaden
the rights beyond the requested umask.

PR:		50148
Reported by:	Ritz, Bruno <bruno_ritz@gmx.ch>
Obtained from:	TrustedBSD Project
2003-08-04 03:29:13 +00:00
Robert Watson
7942b925b8 In ufs_chmod(), use privilege only when required in the following
cases:

- Setting sticky bit on non-directory
- Setting setgid on a file with a group that isn't in the effective
  or extended groups of the authorizing credential

I.e., test the requirement first, then do the privilege test,
rather than doing the privilege test regardless of the need for
privilege.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-08-04 00:31:01 +00:00
Robert Watson
9080ff25cf Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the
kernel ACL interfaces and system call names.

Break out UFS2 and FFS extattr delete and list vnode operations from
setextattr and getextattr to deleteextattr and listextattr, which
cleans up the implementations, and makes the results more readable,
and makes the APIs more clear.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-07-28 18:53:29 +00:00
Poul-Henning Kamp
7c89f162bc Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout. 2003-07-27 17:04:56 +00:00
Poul-Henning Kamp
a8d43c90af Add a "int fd" argument to VOP_OPEN() which in the future will
contain the filedescriptor number on opens from userland.

The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*

For now pass -1 all over the place.
2003-07-26 07:32:23 +00:00
Poul-Henning Kamp
b941a2beb7 We just cached the inode pointer, no need to call VTOI() again. 2003-07-04 12:16:33 +00:00
Alan Cox
4e28b22e35 Lock the vm object when freeing pages. 2003-06-15 21:50:38 +00:00
Poul-Henning Kamp
cefb5754dd Add the same KASSERT to all VOP_STRATEGY and VOP_SPECSTRATEGY implementations
to check that the buffer points to the correct vnode.
2003-06-15 18:53:00 +00:00
Robert Watson
44533b1722 Re-implement kernel access control for quotactl() as found in the
UFS quota implementation.  Push some quite broken access control
logic out of ufs_quotactl() into the individual command
implementations in ufs_quota.c; fix that logic.  Pass in the thread
argument to any quotactl command that will need to perform access
control.

o quotaon() requires privilege (PRISON_ROOT).

o quotaoff() requires privilege (PRISON_ROOT).

o getquota() requires that:

    If the type is USRQUOTA, either the effective uid match the
    requested quota ID, that the unprivileged_get_quota flag be
    set, or that the thread be privileged (PRISON_ROOT).

    If the type is GRPQUOTA, require that either the thread be
    a member of the group represented by the requested quota ID,
    that the unprivileged_get_quota flag be set, or that the
    thread be privileged (PRISON_ROOT).

o setquota() requires privilege (PRISON_ROOT).

o setuse() requires privilege (PRISON_ROOT).

o qsync() requires no special privilege (consistent with what
  was present before, but probably not very useful).

Add a new sysctl, security.bsd.unprivileged_get_quota, which when
set to a non-zero value, will permit unprivileged users to query user
quotas with non-matching uids and gids.  Set this to 0 by default
to be mostly consistent with the previous behavior (the same for
USRQUOTA, but not for GRPQUOTA).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-15 06:36:19 +00:00
Poul-Henning Kamp
7652131bee Initialize struct vfsops C99-sparsely.
Submitted by:   hmp
Reviewed by:	phk
2003-06-12 20:48:38 +00:00
David E. O'Brien
f4636c5959 Use __FBSDID(). 2003-06-11 06:34:30 +00:00
Robert Watson
1e9e2eb598 Implement ffs_listextattr() by breaking out that logic and special-cased
attribute name of "" from ffs_getextattr().  Invoking VOP_GETETATTR()
with an empty name is now no longer supported; user application
compatibility is provided by a system call level compatibility
wrapper.  We make sure to explicitly reject attempts to set an EA
with the name "".

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-05 05:57:39 +00:00
Robert Watson
bd38ab57a1 Don't special-case handling of the empty string in the UFS1
extended attribute retrieval code: it's no longer special-cased,
and is caught by the normal UFS1 EA validity checks (and, in
fact, returns the same error, EINVAL).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-05 04:58:58 +00:00
Robert Watson
e1249def7d Return EOPNOTSUPP for attempted EA operations on VCHR vnodes in UFS2;
if we permit them to occur, the kernel panics due to our performing
EA operations using VOP_STRATEGY on the vnode.  This went unnoticed
previously because there are very for users of device nodes on UFS2
due to the introduction of devfs.  However, this can come up with
the Linux compat directories and its hard-coded dev nodes (which will
need to go away as we move away from hard-coded device numbers).
This can come up if you use EA-intensive features such as ACLs and
MAC.

The proper fix is pretty complicated, but this band-aid would be
an excellent MFC candidate for the release.
2003-06-01 02:42:18 +00:00
Poul-Henning Kamp
61301f74d0 Remove unused variable.
Found by:       FlexeLint
2003-05-31 19:56:09 +00:00
Poul-Henning Kamp
6280ed26af Remove unused local variables.
Found by:       FlexeLint
2003-05-31 18:17:32 +00:00
Poul-Henning Kamp
17a1391990 The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent
deadlocks with vnode backed md(4) devices because md now uses a
kthread to run the bio requests instead of doing it directly from
the bio down path.
2003-05-31 16:42:45 +00:00
Alan Cox
7f758dabbb Lock the vm object when performing vm_object_page_clean().
Approved by:	re (rwatson)
2003-05-18 22:02:51 +00:00
Robert Watson
62d4b85ec1 Jeff added locking assertions that the VV_ flags on vnodes were modified
only while holding appropriate vnode locks.  This patch slides the lock
release for ufs_extattr_enable() to continue to hold the active vnode lock
on a backing file until after the flag change; it also acquires a vnode
lock when disabling an attribute and hence clearing a flag on the backing
vnode.  This permits VFS_DEBUG_LOCKS to run UFS1 extended attributes
without panicking, as well as preventing a potential race and vnode flag
problem.

Approved by:	re (jhb)
Pointed out by:	DEBUG_VFS_LOCKS
2003-05-15 21:07:33 +00:00
Alan Cox
ad682c4825 Lock the vm_object on entry to vm_object_vndeallocate(). 2003-05-03 20:28:26 +00:00
Tim J. Robbins
3632928957 Do not attempt to free NULL dinodes (i_din1 or i_din2) in ffs_ifree().
These fields can be left as NULL if ffs_vget() allocates an inode but
fails before the dinode memory has been allocated. There are two cases
when this can occur: when we lose a race and another process has added
the inode to the hash, and when reading the inode off disk fails.

The bug was observed by Kris on one of the package-building machines.
See http://marc.theaimsgroup.com/?l=freebsd-current&m=105172731013411&w=2
In Kris's case, it was the bread() that failed because of a disk error.

The alternative to this patch is to ensure that ffs_vget() does not call
vput() when the inode that hasn't been properly initialised.
2003-05-01 06:41:59 +00:00
Tim J. Robbins
8d721e877d Free i_din2 instead of i_din1 in ffs_ifree() on UFS2 filesystems.
This is purely a cosmetic change because these members are in a
union together.
2003-05-01 06:38:27 +00:00
Mark Murray
51da11a27a Fix some easy, global, lint warnings. In most cases, this means
making some local variables static. In a couple of cases, this means
removing an unused variable.
2003-04-30 12:57:40 +00:00
Alexander Kabaev
104a9b7e3e Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on:	standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-04-29 13:36:06 +00:00
John Baldwin
a15cc35909 Lock both the proc lock and sched_lock when calling sched_nice since
kg_nice is now protected by both.  Being protected by both means that
other places in the kernel that want to read kg_nice only need one of the
two locks.
2003-04-22 20:45:38 +00:00
Jeff Roberson
86711bae9b - Use the sched_nice() api instead of setting the nice value directly.
Tested by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
2003-04-12 01:05:19 +00:00
Alan Cox
6134838f99 Sufficient access checks are performed by vmapbuf() that calling useracc()
is pointless.  Remove the call to useracc().

Don't reinitialize fields that are already initialized by getpbuf().

Reviewed by:	tegge
2003-04-06 19:26:30 +00:00
Tor Egge
5e2e6a67c4 Check return value from vmapbuf instead of the function address. 2003-03-27 20:48:34 +00:00
Tor Egge
10dccf8ff2 Eliminate a buffer sleep/wakeup race. 2003-03-27 19:28:11 +00:00
Tor Egge
5bbb806004 Add support for reading directly from file to userland buffer when the
O_DIRECT descriptor status flag is set and both offset and length is a
multiple of the physical media sector size.
2003-03-26 23:40:42 +00:00
John Baldwin
31566c96f4 Use td->td_ucred instead of td->td_proc->p_ucred. 2003-03-20 21:17:40 +00:00
John Baldwin
2a53bfbe62 Minor fixes to ffs_fserr():
- Assume that curthread is not NULL.  It never is in -current.
- Use td_ucred instead of p_ucred.
2003-03-20 21:15:54 +00:00
Poul-Henning Kamp
b4b138c27f Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
Jeff Roberson
09f11da5a3 - Remove a race between fsync like functions and flushbufqueues() by
requiring locked bufs in vfs_bio_awrite().  Previously the buf could
   have been written out by fsync before we acquired the buf lock if it
   weren't for giant.  The cluster_wbuild() handles this race properly but
   the single write at the end of vfs_bio_awrite() would not.
 - Modify flushbufqueues() so there is only one copy of the loop.  Pass a
   parameter in that says whether or not we should sync bufs with deps.
 - Call flushbufqueues() a second time and then break if we couldn't find
   any bufs without deps.
2003-03-13 07:19:23 +00:00
Kirk McKusick
34968037b1 Use the appropriate size when zeroing out the unused portion
of a snapshot's copy of a superblock. This patch fixes a panic
when taking a snapshot of a 4096/512 filesystem.

Reported by:	Ian Freislich <ianf@za.uu.net>
Sponsored by:   DARPA & NAI Labs.
2003-03-07 23:49:16 +00:00
Alan Cox
09c80124a3 Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on:	arch@
2003-03-06 03:41:02 +00:00
Jeff Roberson
7261f5f68e - Add a new 'flags' parameter to getblk().
- Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT
   flag to the initial BUF_LOCK().  This will eventually be used in cases
   were we want to use a buffer only if it is not currently in use.
 - Convert all consumers of the getblk() api to use this extra parameter.

Reviwed by:	arch
Not objected to by:	mckusick
2003-03-04 00:04:44 +00:00
Nate Lawson
99648386d3 Finish cleanup of vprint() which was begun with changing v_tag to a string.
Remove extraneous uses of vop_null, instead defering to the default op.
Rename vnode type "vfs" to the more descriptive "syncer".
Fix formatting for various filesystems that use vop_print.
2003-03-03 19:15:40 +00:00
Dag-Erling Smørgrav
521f364b80 More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9). 2003-03-02 16:54:40 +00:00
Kirk McKusick
74f3809a19 Change the field used to test whether the superblock has been updated
from the filesystem size field to the filesystem maximum blocksize
field. The problem is that older versions of growfs updated only the
new size field and not the old size field. This resulted in the old
(smaller) size field being copied up to the new size field which
caused the filesystem to appear to fsck to be badly trashed.

This also adds a sanity check to ensure that the superblock is not
being updated when the filesystem is mounted read-only. Obviously
such an update should never happen.

Reported by:	Nate Lawson <nate@root.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-25 23:21:08 +00:00