Commit Graph

1090 Commits

Author SHA1 Message Date
Poul-Henning Kamp
5c24d6ee26 Eliminate the i_devvp field from the incore UFS inodes, we can
get the same value from ip->i_ump->um_devvp.

This saves a pointer in the memory copies of inodes, which can
easily run into several hundred kilobytes.

The extra indirection is unmeasurable in benchmarks.

Approved by:	mckusick
2003-08-15 20:03:19 +00:00
John Baldwin
8b149b5131 Consistently use the BSD u_int and u_short instead of the SYSV uint and
ushort.  In most of these files, there was a mixture of both styles and
this change just makes them self-consistent.

Requested by:	bde (kern_ktrace.c)
2003-08-07 15:04:27 +00:00
Robert Watson
2495048579 Now that the central POSIX.1e ACL code implements functions to
generate the inode mode from a default ACL and creation mask,
implement ufs_sync_inode_from_acl() using acl_posix1e_newfilemode().

Since ACL_OVERRIDE_MASK/ACL_PRESERVE_MASK are defined, we no
longer need to explicitly pass in a "preserve_mask" field: this
is implicit in the use of POSIX.1e semantics.

Note: this change contains a semantic bugfix for new file creation:
we now intersect the ACL-generated mode and the cmode requested by
the user process.  This means permissions on newly created file
objects will now be more conservative.  In the future, we may want
to provide alternative semantics (similar to Solaris and Linux) in
which the ACL mask overrides the umask, permitting ACLs to broaden
the rights beyond the requested umask.

PR:		50148
Reported by:	Ritz, Bruno <bruno_ritz@gmx.ch>
Obtained from:	TrustedBSD Project
2003-08-04 03:29:13 +00:00
Robert Watson
7942b925b8 In ufs_chmod(), use privilege only when required in the following
cases:

- Setting sticky bit on non-directory
- Setting setgid on a file with a group that isn't in the effective
  or extended groups of the authorizing credential

I.e., test the requirement first, then do the privilege test,
rather than doing the privilege test regardless of the need for
privilege.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-08-04 00:31:01 +00:00
Robert Watson
9080ff25cf Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the
kernel ACL interfaces and system call names.

Break out UFS2 and FFS extattr delete and list vnode operations from
setextattr and getextattr to deleteextattr and listextattr, which
cleans up the implementations, and makes the results more readable,
and makes the APIs more clear.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-07-28 18:53:29 +00:00
Poul-Henning Kamp
7c89f162bc Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout. 2003-07-27 17:04:56 +00:00
Poul-Henning Kamp
a8d43c90af Add a "int fd" argument to VOP_OPEN() which in the future will
contain the filedescriptor number on opens from userland.

The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*

For now pass -1 all over the place.
2003-07-26 07:32:23 +00:00
Poul-Henning Kamp
b941a2beb7 We just cached the inode pointer, no need to call VTOI() again. 2003-07-04 12:16:33 +00:00
Alan Cox
4e28b22e35 Lock the vm object when freeing pages. 2003-06-15 21:50:38 +00:00
Poul-Henning Kamp
cefb5754dd Add the same KASSERT to all VOP_STRATEGY and VOP_SPECSTRATEGY implementations
to check that the buffer points to the correct vnode.
2003-06-15 18:53:00 +00:00
Robert Watson
44533b1722 Re-implement kernel access control for quotactl() as found in the
UFS quota implementation.  Push some quite broken access control
logic out of ufs_quotactl() into the individual command
implementations in ufs_quota.c; fix that logic.  Pass in the thread
argument to any quotactl command that will need to perform access
control.

o quotaon() requires privilege (PRISON_ROOT).

o quotaoff() requires privilege (PRISON_ROOT).

o getquota() requires that:

    If the type is USRQUOTA, either the effective uid match the
    requested quota ID, that the unprivileged_get_quota flag be
    set, or that the thread be privileged (PRISON_ROOT).

    If the type is GRPQUOTA, require that either the thread be
    a member of the group represented by the requested quota ID,
    that the unprivileged_get_quota flag be set, or that the
    thread be privileged (PRISON_ROOT).

o setquota() requires privilege (PRISON_ROOT).

o setuse() requires privilege (PRISON_ROOT).

o qsync() requires no special privilege (consistent with what
  was present before, but probably not very useful).

Add a new sysctl, security.bsd.unprivileged_get_quota, which when
set to a non-zero value, will permit unprivileged users to query user
quotas with non-matching uids and gids.  Set this to 0 by default
to be mostly consistent with the previous behavior (the same for
USRQUOTA, but not for GRPQUOTA).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-15 06:36:19 +00:00
Poul-Henning Kamp
7652131bee Initialize struct vfsops C99-sparsely.
Submitted by:   hmp
Reviewed by:	phk
2003-06-12 20:48:38 +00:00
David E. O'Brien
f4636c5959 Use __FBSDID(). 2003-06-11 06:34:30 +00:00
Robert Watson
1e9e2eb598 Implement ffs_listextattr() by breaking out that logic and special-cased
attribute name of "" from ffs_getextattr().  Invoking VOP_GETETATTR()
with an empty name is now no longer supported; user application
compatibility is provided by a system call level compatibility
wrapper.  We make sure to explicitly reject attempts to set an EA
with the name "".

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-05 05:57:39 +00:00
Robert Watson
bd38ab57a1 Don't special-case handling of the empty string in the UFS1
extended attribute retrieval code: it's no longer special-cased,
and is caught by the normal UFS1 EA validity checks (and, in
fact, returns the same error, EINVAL).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2003-06-05 04:58:58 +00:00
Robert Watson
e1249def7d Return EOPNOTSUPP for attempted EA operations on VCHR vnodes in UFS2;
if we permit them to occur, the kernel panics due to our performing
EA operations using VOP_STRATEGY on the vnode.  This went unnoticed
previously because there are very for users of device nodes on UFS2
due to the introduction of devfs.  However, this can come up with
the Linux compat directories and its hard-coded dev nodes (which will
need to go away as we move away from hard-coded device numbers).
This can come up if you use EA-intensive features such as ACLs and
MAC.

The proper fix is pretty complicated, but this band-aid would be
an excellent MFC candidate for the release.
2003-06-01 02:42:18 +00:00
Poul-Henning Kamp
61301f74d0 Remove unused variable.
Found by:       FlexeLint
2003-05-31 19:56:09 +00:00
Poul-Henning Kamp
6280ed26af Remove unused local variables.
Found by:       FlexeLint
2003-05-31 18:17:32 +00:00
Poul-Henning Kamp
17a1391990 The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent
deadlocks with vnode backed md(4) devices because md now uses a
kthread to run the bio requests instead of doing it directly from
the bio down path.
2003-05-31 16:42:45 +00:00
Alan Cox
7f758dabbb Lock the vm object when performing vm_object_page_clean().
Approved by:	re (rwatson)
2003-05-18 22:02:51 +00:00
Robert Watson
62d4b85ec1 Jeff added locking assertions that the VV_ flags on vnodes were modified
only while holding appropriate vnode locks.  This patch slides the lock
release for ufs_extattr_enable() to continue to hold the active vnode lock
on a backing file until after the flag change; it also acquires a vnode
lock when disabling an attribute and hence clearing a flag on the backing
vnode.  This permits VFS_DEBUG_LOCKS to run UFS1 extended attributes
without panicking, as well as preventing a potential race and vnode flag
problem.

Approved by:	re (jhb)
Pointed out by:	DEBUG_VFS_LOCKS
2003-05-15 21:07:33 +00:00
Alan Cox
ad682c4825 Lock the vm_object on entry to vm_object_vndeallocate(). 2003-05-03 20:28:26 +00:00
Tim J. Robbins
3632928957 Do not attempt to free NULL dinodes (i_din1 or i_din2) in ffs_ifree().
These fields can be left as NULL if ffs_vget() allocates an inode but
fails before the dinode memory has been allocated. There are two cases
when this can occur: when we lose a race and another process has added
the inode to the hash, and when reading the inode off disk fails.

The bug was observed by Kris on one of the package-building machines.
See http://marc.theaimsgroup.com/?l=freebsd-current&m=105172731013411&w=2
In Kris's case, it was the bread() that failed because of a disk error.

The alternative to this patch is to ensure that ffs_vget() does not call
vput() when the inode that hasn't been properly initialised.
2003-05-01 06:41:59 +00:00
Tim J. Robbins
8d721e877d Free i_din2 instead of i_din1 in ffs_ifree() on UFS2 filesystems.
This is purely a cosmetic change because these members are in a
union together.
2003-05-01 06:38:27 +00:00
Mark Murray
51da11a27a Fix some easy, global, lint warnings. In most cases, this means
making some local variables static. In a couple of cases, this means
removing an unused variable.
2003-04-30 12:57:40 +00:00
Alexander Kabaev
104a9b7e3e Deprecate machine/limits.h in favor of new sys/limits.h.
Change all in-tree consumers to include <sys/limits.h>

Discussed on:	standards@
Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
2003-04-29 13:36:06 +00:00
John Baldwin
a15cc35909 Lock both the proc lock and sched_lock when calling sched_nice since
kg_nice is now protected by both.  Being protected by both means that
other places in the kernel that want to read kg_nice only need one of the
two locks.
2003-04-22 20:45:38 +00:00
Jeff Roberson
86711bae9b - Use the sched_nice() api instead of setting the nice value directly.
Tested by:	Steve Kargl <sgk@troutmask.apl.washington.edu>
2003-04-12 01:05:19 +00:00
Alan Cox
6134838f99 Sufficient access checks are performed by vmapbuf() that calling useracc()
is pointless.  Remove the call to useracc().

Don't reinitialize fields that are already initialized by getpbuf().

Reviewed by:	tegge
2003-04-06 19:26:30 +00:00
Tor Egge
5e2e6a67c4 Check return value from vmapbuf instead of the function address. 2003-03-27 20:48:34 +00:00
Tor Egge
10dccf8ff2 Eliminate a buffer sleep/wakeup race. 2003-03-27 19:28:11 +00:00
Tor Egge
5bbb806004 Add support for reading directly from file to userland buffer when the
O_DIRECT descriptor status flag is set and both offset and length is a
multiple of the physical media sector size.
2003-03-26 23:40:42 +00:00
John Baldwin
31566c96f4 Use td->td_ucred instead of td->td_proc->p_ucred. 2003-03-20 21:17:40 +00:00
John Baldwin
2a53bfbe62 Minor fixes to ffs_fserr():
- Assume that curthread is not NULL.  It never is in -current.
- Use td_ucred instead of p_ucred.
2003-03-20 21:15:54 +00:00
Poul-Henning Kamp
b4b138c27f Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
Jeff Roberson
09f11da5a3 - Remove a race between fsync like functions and flushbufqueues() by
requiring locked bufs in vfs_bio_awrite().  Previously the buf could
   have been written out by fsync before we acquired the buf lock if it
   weren't for giant.  The cluster_wbuild() handles this race properly but
   the single write at the end of vfs_bio_awrite() would not.
 - Modify flushbufqueues() so there is only one copy of the loop.  Pass a
   parameter in that says whether or not we should sync bufs with deps.
 - Call flushbufqueues() a second time and then break if we couldn't find
   any bufs without deps.
2003-03-13 07:19:23 +00:00
Kirk McKusick
34968037b1 Use the appropriate size when zeroing out the unused portion
of a snapshot's copy of a superblock. This patch fixes a panic
when taking a snapshot of a 4096/512 filesystem.

Reported by:	Ian Freislich <ianf@za.uu.net>
Sponsored by:   DARPA & NAI Labs.
2003-03-07 23:49:16 +00:00
Alan Cox
09c80124a3 Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on:	arch@
2003-03-06 03:41:02 +00:00
Jeff Roberson
7261f5f68e - Add a new 'flags' parameter to getblk().
- Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT
   flag to the initial BUF_LOCK().  This will eventually be used in cases
   were we want to use a buffer only if it is not currently in use.
 - Convert all consumers of the getblk() api to use this extra parameter.

Reviwed by:	arch
Not objected to by:	mckusick
2003-03-04 00:04:44 +00:00
Nate Lawson
99648386d3 Finish cleanup of vprint() which was begun with changing v_tag to a string.
Remove extraneous uses of vop_null, instead defering to the default op.
Rename vnode type "vfs" to the more descriptive "syncer".
Fix formatting for various filesystems that use vop_print.
2003-03-03 19:15:40 +00:00
Dag-Erling Smørgrav
521f364b80 More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9). 2003-03-02 16:54:40 +00:00
Kirk McKusick
74f3809a19 Change the field used to test whether the superblock has been updated
from the filesystem size field to the filesystem maximum blocksize
field. The problem is that older versions of growfs updated only the
new size field and not the old size field. This resulted in the old
(smaller) size field being copied up to the new size field which
caused the filesystem to appear to fsck to be badly trashed.

This also adds a sanity check to ensure that the superblock is not
being updated when the filesystem is mounted read-only. Obviously
such an update should never happen.

Reported by:	Nate Lawson <nate@root.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-25 23:21:08 +00:00
Jeff Roberson
17661e5ac4 - Add an interlock argument to BUF_LOCK and BUF_TIMELOCK.
- Remove the buftimelock mutex and acquire the buf's interlock to protect
   these fields instead.
 - Hold the vnode interlock while locking bufs on the clean/dirty queues.
   This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
   BUF_LOCK with a LK_TIMEFAIL to a single lock.

Reviewed by:	arch, mckusick
2003-02-25 03:37:48 +00:00
David Schultz
9cdb2d4d9d Expand the reference count on struct dquot to 32 bits.
This fixes a panic on large systems where a single user
may have more than 64K active or inactive vnodes.

PR:		48234
Reviewed by:	mike (mentor)
2003-02-24 08:49:59 +00:00
Kirk McKusick
3bf0ed940b When removing the last item from a non-empty worklist, the worklist
tail pointer must be updated.

Reported by:	Kris Kennaway <kris@obsecurity.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-24 07:28:41 +00:00
Kirk McKusick
5bb651cb72 This patch fixes a deadlock between the bufdaemon and a process taking
a snapshot. As part of taking a snapshot of a filesystem, the kernel
builds up a list of the filesystem metadata (such as the cylinder
group bitmaps) that are contained in the snapshot. When doing a
copy-on-write check, the list is first consulted. If the block being
written is found on the list, then the full snapshot lookup can be
avoided. Besides providing an important performance speedup this
check also avoids a potential deadlock between the code creating
the snapshot and the bufdaemon trying to cleanup snapshot related
buffers. This fix creates a temporary list containing the key
metadata blocks that can cause the deadlock. This temporary list
is used between the time that the snapshot is first enabled and the
time that the fully complete list is built.

Reported by:	Attila Nagy <bra@fsn.hu>
Sponsored by:   DARPA & NAI Labs.
2003-02-22 00:59:34 +00:00
Kirk McKusick
37e2ebfdba This patch fixes a bug on an active filesystem on which a snapshot
is being taken from panicing with either "freeing free block" or
"freeing free inode". The problem arises when the snapshot code
is scanning the filesystem looking for inodes with a reference
count of zero (e.g., unlinked but still open) so that it can
expunge them from its view. If it encounters a reclaimed vnode
and has to restart its scan, then it will panic if it encounters
and tries to free an inode that it has already processed. The fix
is to check each candidate inode to see if it has already been
processed before trying to delete it from the snapshot image.

Sponsored by:   DARPA & NAI Labs.
2003-02-22 00:29:51 +00:00
Kirk McKusick
d60682c239 This patch fixes a bug in the logical block calculation macros so
that they convert to 64-bit values before shifting rather than
afterwards. Once fixed, they can be used rather than inline expanded.

Sponsored by:   DARPA & NAI Labs.
2003-02-22 00:19:26 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Kirk McKusick
aca3e4974f Replace use of random() with arc4random() to provide less guessable
values for the initial inode generation numbers in newfs and for
newly allocated inode generation numbers in the kernel.

Submitted by:	Theo de Raadt <deraadt@cvs.openbsd.org>
Sponsored by:   DARPA & NAI Labs.
2003-02-14 21:31:58 +00:00