Commit Graph

5856 Commits

Author SHA1 Message Date
rwatson
80fa5afdad Correct merge-o: disable the right execve() variation if !MAC 2002-11-05 18:04:50 +00:00
rwatson
6c4f4d26f4 Bring in two sets of changes:
(1) Permit userland applications to request a change of label atomic
    with an execve() via mac_execve().  This is required for the
    SEBSD port of SELinux/FLASK.  Attempts to invoke this without
    MAC compiled in result in ENOSYS, as with all other MAC system
    calls.  Complexity, if desired, is present in policy modules,
    rather than the framework.

(2) Permit policies to have access to both the label of the vnode
    being executed as well as the interpreter if it's a shell
    script or related UNIX nonsense.  Because we can't hold both
    vnode locks at the same time, cache the interpreter label.
    SEBSD relies on this because it supports secure transitioning
    via shell script executables.  Other policies might want to
    take both labels into account during an integrity or
    confidentiality decision at execve()-time.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-05 17:51:56 +00:00
rwatson
948267c75e Regen. 2002-11-05 17:48:04 +00:00
rwatson
0f637b25ea Flesh out the definition of __mac_execve(): per earlier discussion,
it's essentially execve() with an optional MAC label argument.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-05 17:47:08 +00:00
rwatson
373a915367 Assert that appropriate vnodes are locked in mac_execve_will_transition().
Allow transitioning to be twiddled off using the process and fs enforcement
flags, although at some point this should probably be its own flag.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-05 15:11:33 +00:00
rwatson
c2166f1034 Hook up the mac_will_execve_transition() and mac_execve_transition()
entrypoints, #ifdef MAC.  The supporting logic already existed in
kern_mac.c, so no change there.  This permits MAC policies to cause
a process label change as the result of executing a binary --
typically, as a result of executing a specially labeled binary.

For example, the SEBSD port of SELinux/FLASK uses this functionality
to implement TE type transitions on processes using transitioning
binaries, in a manner similar to setuid.  Policies not implementing
a notion of transition (all the ones in the tree right now) require
no changes, since the old label data is copied to the new label
via mac_create_cred() even if a transition does occur.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-05 14:57:49 +00:00
keramida
430eab0e43 Typo in comment: commmand -> command
Reviewed by:	jhb
2002-11-05 14:54:07 +00:00
rwatson
e05e16efa1 Remove reference to struct execve_args from struct imgact, which
describes an image activation instance.  Instead, make use of the
existing fname structure entry, and introduce two new entries,
userspace_argv, and userspace_envv.  With the addition of
mac_execve(), this divorces the image structure from the specifics
of the execve() system call, removes a redundant pointer, etc.
No semantic change from current behavior, but it means that the
structure doesn't depend on syscalls.master-generated includes.

There seems to be some redundant initialization of imgact entries,
which I have maintained, but which could probably use some cleaning
up at some point.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-05 01:59:56 +00:00
rwatson
b8dd64f5ef Permit MAC policies to instrument the access control decisions for
system accounting configuration and for nfsd server thread attach.
Policies might use this to protect the integrity or confidentiality
of accounting data, limit the ability to turn on or off accounting,
as well as to prevent inappropriately labeled threads from becoming nfs
server threads.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-04 15:13:36 +00:00
rwatson
8f2b40ef3f Remove mac_cache_fslabel_in_vnode sysctl -- with the new VFS/MAC
construction, labels are always cached.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-04 14:55:14 +00:00
rwatson
89cdee7c52 License clarification and wording changes: NAI has approved removal of
clause three, and NAI Labs now goes by the name Network Associates
Laboratories.
2002-11-04 01:42:39 +00:00
rwatson
7537530ad8 Introduce mac_check_system_settime(), a MAC check allowing policies to
augment the system policy for changing the system time.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-03 02:39:42 +00:00
rwatson
7c29b60fcb Regen from yesterday's system call placeholder rename. 2002-11-02 23:54:36 +00:00
alc
b3cfb0145e Catch up with the removal of the vm page buckets spin mutex. 2002-11-02 22:42:18 +00:00
alc
328836c626 Revert the change in revision 1.77 of kern/uipc_socket2.c. It is causing
a panic because the socket's state isn't as expected by sofree().

Discussed with: dillon, fenner
2002-11-02 05:14:31 +00:00
kbyanc
3cb0634996 Update the st_size reported via stat(2) to accurately reflect the amount
of data available to read for non-TCP sockets.

Reviewed by:	-net, -arch
Sponsored by:	NTT Multimedia Communications Labs
MFC after:	2 weeks
2002-11-01 21:31:13 +00:00
kbyanc
c70af01e80 Track the number of non-data chararacters stored in socket buffers so that
the data value returned by kevent()'s EVFILT_READ filter on non-TCP
sockets accurately reflects the amount of data that can be read from the
sockets by applications.

PR:		30634
Reviewed by:	-net, -arch
Sponsored by:	NTT Multimedia Communications Labs
MFC after:	2 weeks
2002-11-01 21:27:59 +00:00
rwatson
678a0e6331 Rename __execve_mac() to __mac_execve() for increased consistency
with other MAC system calls.

Requested by:	various (phk, gordont, jake, ...)
2002-11-01 21:00:02 +00:00
rwatson
61ffc1b9bb Add MAC checks for various kenv() operations: dump, get, set, unset,
permitting MAC policies to limit access to the kernel environment.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-11-01 20:46:53 +00:00
phk
27173bd0dd Introduce malloc_last_fail() which returns the number of seconds since
malloc(9) failed last time.  This is intended to help code adjust
memory usage to the current circumstances.

A typical use could be:
	if (malloc_last_fail() < 60)
		reduce_cache_by_one();
2002-11-01 18:58:12 +00:00
phk
3ce3aae31e Introduce a "time_uptime" global variable which holds the time since boot
in seconds.
2002-11-01 18:52:20 +00:00
davidxu
4eeb3d0411 KSE-enabled processes only. 2002-10-31 08:00:51 +00:00
rwatson
122a6b9ad2 Move to C99 sparse structure initialization for the mac_policy_ops
structure definition, rather than using an operation vector
we translate into the structure.  Originally, we used a vector
for two reasons:

(1) We wanted to define the structure sparsely, which wasn't
    supported by the C compiler for structures.  For a policy
    with five entry points, you don't want to have to stick in
    a few hundred NULL function pointers.

(2) We thought it would improve ABI compatibility allowing modules
    to work with kernels that had a superset of the entry points
    defined in the module, even if the kernel had changed its
    entry point set.

Both of these no longer apply:

(1) C99 gives us a way to sparsely define a static structure.

(2) The ABI problems existed anyway, due to enumeration numbers,
    argument changes, and semantic mismatches.  Since the going
    rule for FreeBSD is that you really need your modules to
    pretty closely match your kernel, it's not worth the
    complexity.

This submit eliminates the operation vector, dynamic allocation
of the operation structure, copying of the vector to the
structure, and redoes the vectors in each policy to direct
structure definitions.  One enourmous benefit of this change
is that we now get decent type checking on policy entry point
implementation arguments.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-30 18:48:51 +00:00
rwatson
27e2336a32 While 'mode_t' seemed like a good idea for the access mode argument for
MAC access() and open() checks, the argument actually has an int type
where it becomes available.  Switch to using 'int' for the mode argument
throughout the MAC Framework and policy modules.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-30 17:56:57 +00:00
davidxu
3ab60f4a38 Check NULL thread mailbox pointer. 2002-10-30 05:09:29 +00:00
davidxu
bdf12834c6 Style fixes. 2002-10-30 03:01:28 +00:00
davidxu
0d865413b8 Don't forget to set syscall result. 2002-10-30 02:39:10 +00:00
davidxu
7531bd3c2f Add an actual implementation of kse_thr_interrupt() 2002-10-30 02:28:41 +00:00
rwatson
8286401862 Minor comment typo fix.
Submitted by:	Wayne Morrison <tewok@tislabs.com>
2002-10-29 20:51:44 +00:00
dwmalone
6878955e08 The syscall names are string constants, so make them consts. 2002-10-29 15:47:06 +00:00
rwatson
687d4fe60e Trim extraneous #else and #endif MAC comments per style(9). 2002-10-28 21:17:53 +00:00
rwatson
9ae04d1a06 An inappropriate ASSERT slipped in during the recent merge of the
reboot checking; remove.
2002-10-28 18:53:53 +00:00
davidxu
fb65dc6cd6 Close a race window in kse_create(): signal delivered after SIGPENDING call
but before we call kse_link().
2002-10-28 07:37:06 +00:00
iedowse
092b51aeec Fix a case in kern_rename() where a vn_finished_write() call was
missed. This bug has been present since the vn_start_write() and
vn_finished_write() calls were first added in revision 1.159. When
the case is triggered, any attempts to create snapshots on the
filesystem will deadlock and also prevent further write activity
on that filesystem.
2002-10-27 23:23:51 +00:00
wollman
7e9d4df21f Change the way support for asynchronous I/O is indicated to applications
to conform to 1003.1-2001.  Make it possible for applications to actually
tell whether or not asynchronous I/O is supported.

Since FreeBSD's aio implementation works on all descriptor types, don't
call down into file or vnode ops when [f]pathconf() is asked about
_PC_ASYNC_IO; this avoids the need for every file and vnode op to know about
it.
2002-10-27 18:07:41 +00:00
rwatson
e6f3037210 Centrally manage enforcement of {reboot,swapon,sysctl} using the
mac_enforce_system toggle, rather than several separate toggles.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-27 15:50:49 +00:00
rwatson
653f637c44 Implement mac_check_system_sysctl(), a MAC Framework entry point to
permit MAC policies to augment the security protections on sysctl()
operations.  This is not really a wonderful entry point, as we
only have access to the MIB of the target sysctl entry, rather than
the more useful entry name, but this is sufficient for policies
like Biba that wish to use their notions of privilege or integrity
to prevent inappropriate sysctl modification.  Affects MAC kernels
only.  Since SYSCTL_LOCK isn't in sysctl.h, just kern_sysctl.c,
we can't assert the SYSCTL subsystem lockin the MAC Framework.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-27 07:12:34 +00:00
rwatson
be98961ae9 Hook up mac_check_system_reboot(), a MAC Framework entry point that
permits MAC modules to augment system security decisions regarding
the reboot() system call, if MAC is compiled into the kernel.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-27 07:03:29 +00:00
rwatson
8cd9e63819 Merge from MAC tree: rename mac_check_vnode_swapon() to
mac_check_system_swapon(), to reflect the fact that the primary
object of this change is the running kernel as a whole, rather
than just the vnode.  We'll drop additional checks of this
class into the same check namespace, including reboot(),
sysctl(), et al.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-27 06:54:06 +00:00
mux
65fb9af8bb Fix a style nit. 2002-10-26 18:19:46 +00:00
rwatson
312cab0dee Slightly change the semantics of vnode labels for MAC: rather than
"refreshing" the label on the vnode before use, just get the label
right from inception.  For single-label file systems, set the label
in the generic VFS getnewvnode() code; for multi-label file systems,
leave the labeling up to the file system.  With UFS1/2, this means
reading the extended attribute during vfs_vget() as the inode is
pulled off disk, rather than hitting the extended attributes
frequently during operations later, improving performance.  This
also corrects sematics for shared vnode locks, which were not
previously present in the system.  This chances the cache
coherrency properties WRT out-of-band access to label data, but in
an acceptable form.  With UFS1, there is a small race condition
during automatic extended attribute start -- this is not present
with UFS2, and occurs because EAs aren't available at vnode
inception.  We'll introduce a work around for this shortly.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-26 14:38:24 +00:00
julian
64467d2a2f iBack out david's last commit. the suspension code needs to be called
for non KSE processes too.
2002-10-26 04:44:17 +00:00
davidxu
9f183ef3fc Move suspension checking code from userret() into thread_userret(). 2002-10-26 02:56:51 +00:00
davidxu
f9c45007d3 Backout revision 1.48. 2002-10-26 01:26:36 +00:00
rwatson
068e73d389 Comment describing the semantics of mac_late.
Trim trailing whitespace.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-25 20:45:27 +00:00
mux
bdcd4ebb5d - Rename the DDB specific %z printf format to %y.
- Make DDB use %y instead of %z.
- Teach GCC about %y.
- Implement support for the C99 %z format modifier.

Approved by:	re@
Reviewed by:	peter
Tested on:	i386, sparc64
2002-10-25 19:41:32 +00:00
peter
f7fa86b743 Split 4.x and 5.x signal handling so that we can keep 4.x signal
handling clean and functional as 5.x evolves.  This allows some of the
nasty bandaids in the 5.x codepaths to be unwound.

Encapsulate 4.x signal handling under COMPAT_FREEBSD4 (there is an
anti-foot-shooting measure in place, 5.x folks need this for a while) and
finish encapsulating the older stuff under COMPAT_43.  Since the ancient
stuff is required on alpha (longjmp(3) passes a 'struct osigcontext *'
to the current sigreturn(2), instead of the 'ucontext_t *' that sigreturn
is supposed to take), add a compile time check to prevent foot shooting
there too.  Add uniform COMPAT_43 stubs for ia64/sparc64/powerpc.

Tested on: i386, alpha, ia64.  Compiled on sparc64 (a few days ago).
Approved by: re
2002-10-25 19:10:58 +00:00
phk
097ab10d4b #include <geom/geom.h> to get proper prototypes. Contrary to my fears we
seem to have all the prerequisites already.

Call g_waitidle() as the first thing in vfs_mountroot() so that we have
it out of the way before we even decide if we should call .._ask() or
.._try().

Call the g_dev_print() function to provide better guidance for the
root-mount prompt.
2002-10-25 18:44:42 +00:00
davidxu
3f4f4ce169 suspend thread only when it can be interrupted. 2002-10-25 13:12:36 +00:00
davidxu
de4094aa31 let thread_schedule_upcall() handle idle kse. 2002-10-25 12:50:31 +00:00
phk
b6e6ea6570 Disable the kernacc() check in mtx_validate() until such time that kernacc
does not require Giant.

This means that we may miss panics on a class of mutex programming bugs,
but only if running with a Chernobyl setting of debug-flags.

Spotted by:	Pete Carah <pete@ns.altadena.net>
2002-10-25 08:40:20 +00:00
phk
7320ebcf72 In vrele() we can actually have a VCHR with v_rdev == NULL if we
came from the bottom of addaliasu().  Don't panic.
2002-10-25 07:58:25 +00:00
julian
35e17a8e76 fix style-o 2002-10-25 07:17:07 +00:00
julian
4ea837f673 More work on the interaction between suspending and sleeping threads.
Also clean up some code used with 'single-threading'.

Reviewed by:	davidxu
2002-10-25 07:11:12 +00:00
mckusick
6b1611bd94 Within ufs, the ffs_sync and ffs_fsync functions did not always
check for and/or report I/O errors. The result is that a VFS_SYNC
or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in
the presence of a hard error writing a disk sector or in a filesystem
full condition. This patch ensures that I/O errors will always be
checked and returned.  This patch also ensures that every call to
VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes
appropriate action when an error is returned.

Sponsored by:   DARPA & NAI Labs.
2002-10-25 00:20:37 +00:00
davidxu
776a2129fe fix typo. 2002-10-25 00:13:46 +00:00
julian
59e8ad5a4c Extract out KSE specific code from machine specific code
so that there is ony one copy of it. Fix that one copy
so that KSEs with no mailbox in a KSE program are not a cause
of page faults (this can legitmatly happen).

Submitted by:	(parts) davidxu
2002-10-24 23:09:48 +00:00
phk
a3766d9d16 Fix the spechash lock order reversal by keeping an updated sum
of v_usecount in the dev_t which vcount() can return without
locking any vnodes.

Seen by:	jhb
2002-10-24 19:38:56 +00:00
phk
2419eff62c Make sure GEOM has stopped rattling the disks before we try to mount
the root filesystem, this may be implicated in the PC98 issue.
2002-10-24 19:26:08 +00:00
phk
8ac0d1e756 Don't try to be cute and save a call/return by implementing a degenerate
vrele() inline.
2002-10-24 17:55:49 +00:00
davidxu
1dbc90aa75 respect TDF_SINTR, also for SINGLE_NO_EXIT threading mode, if a thread
was already suspended, do nothing.
2002-10-24 14:43:48 +00:00
davidxu
1ad7602cf0 don't forget to remove kse from idle queue. 2002-10-24 09:16:46 +00:00
julian
842e39ccd8 Move thread related code from kern_proc.c to kern_thread.c.
Add code to free KSEs and KSEGRPs on exit.
Sort KSE prototypes in proc.h.
Add the missing kse_exit() syscall.

ksetest now does not leak KSEs and KSEGRPS.

Submitted by:	(parts) davidxu
2002-10-24 08:46:34 +00:00
des
ceed53ef32 Whitespace cleanup. 2002-10-23 10:26:54 +00:00
kan
f8332b9941 Handle binaries with arbitrary number PT_LOAD sections, not only
ones with one text and one data section.

The text and data rlimit checks still needs to be fixed to properly
accout for additional sections.

Reviewed by:	peter (slightly different patch version)
2002-10-23 01:57:39 +00:00
jhb
1cc1792ab5 Don't dereference the 'x' pointer if it is NULL, instead skip the
assignment.  The netsmb code likes to call these functions with a NULL
x argument a lot.

Reported by:	Vallo Kallaste <kalts@estpak.ee>
2002-10-22 18:44:59 +00:00
robert
dedc53fcbe Change the `mutex_prof' structure to use three variables contained
in an anonymous structure as counters, instead of an array with
preprocessor-defined names for indices.  Remove the associated XXX-
comment.
2002-10-22 16:06:28 +00:00
rwatson
e40371a8f9 Introduce MAC_CHECK_VNODE_SWAPON, which permits MAC policies to
perform authorization checks during swapon() events; policies
might choose to enforce protections based on the credential
requesting the swap configuration, the target of the swap operation,
or other factors such as internal policy state.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-22 15:53:43 +00:00
rwatson
9fe777b3e6 Missed in previous merge: export sizeof(struct oldmac) rather than
sizeof(struct mac).

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-22 15:33:33 +00:00
rwatson
4651fb3eba Support the new MAC user API in kernel: modify existing system calls
to use a modified notion of 'struct mac', and flesh out the new variation
system calls (almost identical to existing ones except that they permit
a pid to be specified for process label retrieval, and don't follow
symlinks).  This generalizes the label API so that the framework is
now almost entirely policy-agnostic.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-22 14:29:47 +00:00
rwatson
d560423432 Regen. 2002-10-22 14:23:52 +00:00
rwatson
a3ad68f14a Flesh out prototypes for __mac_get_pid, __mac_get_link, and
__mac_set_link, based on __mac_get_proc() except with a pid,
and __mac_get_file(), __mac_set_file() except that they do
not follow symlinks.  First in a series of commits to flesh
out the user API.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-22 14:22:24 +00:00
davidxu
94b30e0ab5 detect idle kse correctly. 2002-10-22 02:27:19 +00:00
mckusick
76a6cc0dc1 This update removes a race between unmount and lookup. The lookup
locks the mount point directory while waiting for vfs_busy to clear.
Meanwhile the unmount which holds the vfs_busy lock tried to lock
the mount point vnode. The fix is to observe that it is safe for the
unmount to remove the vnode from the mount point without locking it.
The lookup will wait for the unmount to complete, then recheck the
mount point when the vfs_busy lock clears.

Sponsored by:	DARPA & NAI Labs.
2002-10-22 01:06:44 +00:00
mckusick
305e5868f3 This checkin reimplements the io-request priority hack in a way
that works in the new threaded kernel. It was commented out of
the disksort routine earlier this year for the reasons given in
kern/subr_disklabel.c (which is where this code used to reside
before it moved to kern/subr_disk.c):

----------------------------
revision 1.65
date: 2002/04/22 06:53:20;  author: phk;  state: Exp;  lines: +5 -0
Comment out Kirks io-request priority hack until we can do this in a
civilized way which doesn't cause grief.

The problem is that it is not generally safe to cast a "struct bio
*" to a "struct buf *".  Things like ccd, vinum, ata-raid and GEOM
constructs bio's which are not entrails of a struct buf.

Also, curthread may or may not have anything to do with the I/O request
at hand.

The correct solution can either be to tag struct bio's with a
priority derived from the requesting threads nice and have disksort
act on this field, this wouldn't address the "silly-seek syndrome"
where two equal processes bang the diskheads from one edge to the
other of the disk repeatedly.

Alternatively, and probably better: a sleep should be introduced
either at the time the I/O is requested or at the time it is completed
where we can be sure to sleep in the right thread.

The sleep also needs to be in constant timeunits, 1/hz can be practicaly
any sub-second size, at high HZ the current code practically doesn't
do anything.
----------------------------

As suggested in this comment, it is no longer located in the disk sort
routine, but rather now resides in spec_strategy where the disk operations
are being queued by the thread that is associated with the process that
is really requesting the I/O. At that point, the disk queues are not
visible, so the I/O for positively niced processes is always slowed
down whether or not there is other activity on the disk.

On the issue of scaling HZ, I believe that the current scheme is
better than using a fixed quantum of time. As machines and I/O
subsystems get faster, the resolution on the clock also rises.
So, ten years from now we will be slowing things down for shorter
periods of time, but the proportional effect on the system will
be about the same as it is today. So, I view this as a feature
rather than a drawback. Hence this patch sticks with using HZ.

Sponsored by:	DARPA & NAI Labs.
Reviewed by:	Poul-Henning Kamp <phk@critter.freebsd.dk>
2002-10-22 00:59:49 +00:00
phk
687253b085 GEOM does not (and shall not) propagate flags like D_MEMDISK, so we will
revert to checking the name to determine if our root device is a ramdisk,
md(4) specifically to determine if we should attempt the root-mount RW

Sponsored by:	DARPA & NAI Labs.
2002-10-21 20:09:59 +00:00
des
d93c97ce51 Reduce the overhead of the mutex statistics gathering code, try to produce
shorter lines in the report, and clean up some minor style issues.
2002-10-21 18:48:28 +00:00
cognet
54f5e2ef60 One #include <sys/sysctl.h> should be enough.
Approved by:	mux (mentor)
2002-10-21 18:40:40 +00:00
brooks
3eee68d184 Use if_printf(ifp, "blah") instead of
printf("%s%d: blah", ifp->if_name, ifp->if_xname).
2002-10-21 02:51:56 +00:00
tmm
f70d233938 Fix the calculations of the length of the unread message buffer
contents. The code was subtracting two unsigned ints, stored the
result in a log and expected it to be the same as of a signed
subtraction; this does only work on platforms where int and long
have the same size (due to overflows).
Instead, cast to long before the subtraction; the numbers are
guaranteed to be small enough so that there will be no overflows
because of that.
2002-10-20 23:13:05 +00:00
phk
3a05c0f089 We have memset() and memcpy() in the kernel now, so we don't need to
#define them to bzero and bcopy.

Spotted by:	FlexeLint
2002-10-20 22:33:42 +00:00
julian
34c14a0b1d Add an actual implementation of kse_wakeup()
Submitted by:	Davidxu
2002-10-20 21:08:47 +00:00
tmm
5c94fe601c Add kernel dump support, based on the ia64 version (which was committed
as sparc64/sparc64/dump_machdep.c a while back).
Other than ia64 (which uses ELF), sparc64 uses a homegrown format for
the dumps (headers are required because the physical address and size of
the tsb must be noted, and because physical memory may be discontiguous);
ELF would not offer any advantages here.

Reviewed by:	jake
2002-10-20 17:03:15 +00:00
phk
5a491baa51 #unifdef the code for checking blessed lock collisions until we need it.
Spotted by:	DARPA & NAI Labs.
2002-10-20 08:48:39 +00:00
rwatson
d9307dee7e If MAC_MAX_POLICIES isn't defined, don't try to define it, just let the
compile fail.  MAC_MAX_POLICIES should always be defined, or we have
bigger problems at hand.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-20 03:41:09 +00:00
peter
a75c662939 Stake a claim on 418 (__xstat), 419 (__xfstat), 420 (__xlstat) 2002-10-19 22:25:31 +00:00
peter
6f9d4eb337 Grab 416/417 real estate before I get burned while testing again.
This is for the not-quite-ready signal/fpu abi stuff.  It may not see
the light of day, but I'm certainly not going to be able to validate it
when getting shot in the foot due to syscall number conflicts.
2002-10-19 22:09:23 +00:00
rwatson
615a42874b Add a new 'NOMACCHECK' flag to namei() NDINIT flags, which permits the
caller to indicate that MAC checks are not required for the lookup.
Similar to IO_NOMACCHECK for vn_rdwr(), this indicates that the caller
has already performed all required protections and that this is an
internally generated operation.  This will be used by the NFS server
code, as we don't currently enforce MAC protections against requests
delivered via NFS.

While here, add NOCROSSMOUNT to PARAMASK; apparently this was used at
one point for name lookup flag checking, but isn't any longer or it
would have triggered from the NFS server code passing it to indicate
that mountpoints shouldn't be crossed in lookups.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 21:25:51 +00:00
rwatson
4c2d375ca4 Regen from addition of execve_mac placeholder. 2002-10-19 21:15:10 +00:00
rwatson
f3cd77cf07 Add a placeholder for the execve_mac() system call, similar to SELinux's
execve_secure() system call, which permits a process to pass in a label
for a label change during exec.  This permits SELinux to change the
label for the resulting exec without a race following a manual label
change on the process.  Because this interface uses our general purpose
MAC label abstraction, we call it execve_mac(), and wrap our port of
SELinux's execve_secure() around it with appropriate sid mappings.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 21:06:57 +00:00
rwatson
ae81971478 Drop in the MAC check for file creation as part of open().
Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 20:56:44 +00:00
rwatson
08f5fce118 Make sure to clear the 'registered' flag for MAC policies when they
unregister.  Under some obscure (perhaps demented) circumstances,
this can result in a panic if a policy is unregistered, and then someone
foolishly unregisters it again.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 20:30:12 +00:00
rwatson
91dee1ecbb Hook up most of the MAC entry points relating to file/directory/node
creation, deletion, and rename.  There are one or two other stray
cases I'll catch in follow-up commits (such as unix domain socket
creation); this permits MAC policy modules to limit the ability to
perform these operations based on existing UNIX credential / vnode
attributes, extended attributes, and security labels.  In the rename
case using MAC, we now have to lock the from directory and file
vnodes for the MAC check, but this is done only in the MAC case,
and the locks are immediately released so that the remainder of the
rename implementation remains the same.  Because the create check
takes a vattr to know object type information, we now initialize
additional fields in the VATTR passed to VOP_SYMLINK() in the MAC
case.

Approved by:	re
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-19 20:25:57 +00:00
marcel
d35d608c07 Add two hooks to signal module load and module unload to MD code.
The primary reason for this is to allow MD code to process machine
specific attributes, segments or sections in the ELF file and
update machine specific state accordingly. An immediate use of this
is in the ia64 port where unwind information is updated to allow
debugging and tracing in/across modules. Note that this commit
does not add the functionality to the ia64 port. See revision 1.9
of ia64/ia64/elf_machdep.c.

Validated on: alpha, i386, ia64
2002-10-19 19:16:03 +00:00
marcel
c5e66cd2c6 Reduce code duplication by moving the common actions in
link_elf_init(), link_elf_link_preload_finish() and
link_elf_load_file() to link_elf_link_common_finish().
Since link_elf_init() did initializations as a side-effect
of doing the common actions, keep the initialization in
that function. Consequently, link_elf_add_gdb() is now also
called to insert the very first link_map() (ie the kernel).
2002-10-19 18:59:33 +00:00
marcel
6af943a484 Non-functional change in preparation of the next commit:
Move link_elf_add_gdb(), link_elf_delete_gdb() and link_elf_error()
near the top of the file. The *_gdb() functions are moved inside
the #ifdef DDB already present there.
2002-10-19 18:43:37 +00:00
marcel
9d60b8c542 In link_elf_load_file(), when SPARSE_MAPPING is defined and we
cannot allocate ef->object, we freed ef before bailing out with
an error. This is wrong because ef=lf and when we have an error
and lf is non-NULL (which holds if we try to alloc ef->object),
we free lf and thus ef as part of the bailing-out.
2002-10-19 05:01:54 +00:00
alfred
657dbda997 Don't leak memory in semop(2). (Fix a bug I introduced in rev 1.55.)
Detective work by: jake
2002-10-19 02:07:35 +00:00
jhb
0ddab31832 Do not lock the process when calling fdfree() (this would have recursed on
a non-recursive lock, the proc lock, before) since we don't need it to
change p_fd.
2002-10-18 17:45:41 +00:00
jhb
e7e57ceb22 fdfree() clears p_fd for us, no need to do it again. 2002-10-18 17:44:39 +00:00
jhb
22c558ff8c Don't lock the proc lock to clear p_fd. p_fd isn't protected by the proc
lock.
2002-10-18 17:42:28 +00:00
mckusick
71c09f6a0a Have lockinit() initialize the debugging fields of a lock
when DEBUG_LOCKS is defined.

Sponsored by:	DARPA & NAI Labs.
2002-10-18 01:34:10 +00:00
mckusick
be6aa51161 When the number of dirty buffers rises too high, the buf_daemon runs
to help clean up. After selecting a potential buffer to write, this
patch has it acquire a lock on the vnode that owns the buffer before
trying to write it. The vnode lock is necessary to avoid a race with
some other process holding the vnode locked and trying to flush its
dirty buffers. In particular, if the vnode in question is a snapshot
file, then the race can lead to a deadlock. To avoid slowing down the
buf_daemon, it does a non-blocking lock request when trying to lock
the vnode. If it fails to get the lock it skips over the buffer and
continues down its queue looking for buffers to flush.

Sponsored by:	DARPA & NAI Labs.
2002-10-18 01:29:59 +00:00
sobomax
e6dad384fb Separate fiels reported by disk_err() with spaces, so that output doesn't
look cryptic.

MFC after:	1 week
2002-10-17 23:48:29 +00:00
robert
399c4103ec Instead of (sizeof(source_buffer) - 1) bytes, copy at most
(sizeof(destination_buffer) - 1) bytes into the destination buffer.
This was not harmful because they currently both provide space for
(MAXCOMLEN + 1) bytes.
2002-10-17 21:02:02 +00:00
robert
1e0cdb534a Use strlcpy() instead of strncpy() to copy NUL terminated strings
for safety and consistency.
2002-10-17 20:03:38 +00:00
sam
bd5a0b8cba fix kldload error return when a module is rejected because it's statically
linked in the kernel.  When this condition is detected deep in the linker
internals the EEXIST error code that's returned is stomped on and instead
an ENOEXEC code is returned.  This makes apps like sysinstall bitch.
2002-10-17 17:28:57 +00:00
robert
98f9976c41 - Allocate only enough space for a temporary buffer to hold
the path including the terminating NUL character from
   `struct sockaddr_un' rather than SOCK_MAXADDRLEN bytes.
 - Use strlcpy() instead of strncpy() to copy strings.
2002-10-17 15:52:42 +00:00
bmilekic
efa86acc11 Fix a fairly subtle bug in mbuf_init() where the reference counter
contiguous space was being allocated from the clust_map
instead of the mbuf_map as the comments indicated.  This resulted in
some address space wastage in mbuf_map.

Submitted by: Rohit Jalan <rohjal@yahoo.co.in>
2002-10-16 19:59:08 +00:00
jhb
a915592b8d Add a missing PROC_UNLOCK in ptrace() for the PT_IO case.
PR:		kern/44065
Submitted by:	Mark Kettenis <kettenis@chello.nl>
2002-10-16 16:28:33 +00:00
jhb
00200a73ce Many style and whitespace fixes.
Submitted by:	bde (mostly)
2002-10-16 15:45:37 +00:00
jhb
397b9f8a22 Sort includes a bit.
Submitted by:	bde
2002-10-16 15:14:31 +00:00
phk
3d99bae725 Be consistent about funtions being static.
Spotted by:	FlexeLint
2002-10-16 10:42:13 +00:00
sam
2a86be217a Replace aux mbufs with packet tags:
o instead of a list of mbufs use a list of m_tag structures a la openbsd
o for netgraph et. al. extend the stock openbsd m_tag to include a 32-bit
  ABI/module number cookie
o for openbsd compatibility define a well-known cookie MTAG_ABI_COMPAT and
  use this in defining openbsd-compatible m_tag_find and m_tag_get routines
o rewrite KAME use of aux mbufs in terms of packet tags
o eliminate the most heavily used aux mbufs by adding an additional struct
  inpcb parameter to ip_output and ip6_output to allow the IPsec code to
  locate the security policy to apply to outbound packets
o bump __FreeBSD_version so code can be conditionalized
o fixup ipfilter's call to ip_output based on __FreeBSD_version

Reviewed by:	julian, luigi (silent), -arch, -net, darren
Approved by:	julian, silence from everyone else
Obtained from:	openbsd (mostly)
MFC after:	1 month
2002-10-16 01:54:46 +00:00
phk
55f0babdd0 Plug a memory-leak.
"I think you're right" by:	jake
2002-10-15 18:58:38 +00:00
phk
2208595682 Use ; not , as statement separator in PDEBUG() macro.
Ignoring a NULL dev in device_set_ivars() sounds wrong, KASSERT it to
non-NULL instead.

Do the same for device_get_ivars() for reasons of symmetry, though
it probably would have yielded a panic anyway, this gives more precise
diagnostics.

Absentmindedly nodded OK to by:	jhb
2002-10-15 18:56:13 +00:00
jhb
28b39014f9 Argh. Put back setting of P_ADVLOCK for the F_WRLCK case that was
accidentally lost in the previous revision.

Submitted by:	bde
Pointy hat to:	jhb
2002-10-15 18:10:13 +00:00
marcel
e92a317846 Fix kernel module loading on ia64. Cross-module function calls
were improperly relocated due to faulty logic in lookup_fdesc()
in elf_machdep.c. The symbol index (symidx) was bogusly used for
load modules other than the one the relocation applied to. This
resulted in bogus bindings and consequently runtime failures.

The fix is to use the symbol index only for the module being
relocated and to use the symbol name for look-ups in the
modules in the dependent list. As such, we need a function to
return the symbol name given the linker file and symbol index.
2002-10-15 05:40:07 +00:00
peter
805d2c9f3e Restore pointer that was removed in 1.128. This wasn't a merge-o. 2002-10-15 01:36:45 +00:00
jhb
e1100fc10b - Add a new global mutex 'ppeers_lock' to protect the p_peers list of
processes forked with RFTHREAD.
- Use a goto to a label for common code when exiting from fork1() in case
  of an error.
- Move the RFTHREAD linkage setup code later in fork since the ppeers_lock
  cannot be locked while holding a proc lock.  Handle the race of a task
  leader exiting and killing its peers while a peer is forking a new child.
  In that case, go ahead and let the peer process proceed normally as the
  parent is about to kill it.  However, the task leader may have already
  gone to sleep to wait for the peers to die, so the new child process may
  not receive a SIGKILL from the task leader.  Rather than try to destruct
  the new child process, just go ahead and send it a SIGKILL directly and
  add it to the p_peers list.  This ensures that the task leader will wait
  until both the peer process doing the fork() and the new child process
  have received their KILL signals and exited.

Discussed with:	truckman (earlier versions)
2002-10-15 00:14:32 +00:00
jhb
db8a406c24 Remove the leaderp variable and just access p_leader directly. The
p_leader field is not protected by the proc lock but is only set during
fork1() by the parent process and never changes.
2002-10-15 00:03:40 +00:00
alfred
c1ef60cbb3 Remove a KASSERT I added in 1.73 to catch uninitialized pipes.
It must be removed because it is done without the pipe being locked
via pipelock() and therefore is vulnerable to races with pipespace()
erroneously triggering it by temporarily zero'ing out the structure
backing the pipe.

It looks as if this assertion is not needed because all manipulation
of the data changed by pipespace() _is_ protected by pipelock().

Reported by: kris, mckusick
2002-10-14 21:15:04 +00:00
julian
10d4187daf Did you ever notice how stupid bugs show up much clearer
when you see them in a commit message?
2002-10-14 20:43:02 +00:00
julian
9ce470a533 Tidy up the scheduler's code for changing the priority of a thread.
Logically pretty much a NOP.
2002-10-14 20:34:31 +00:00
mckusick
9d00a4a781 When scanning the freelist looking for candidate vnodes to recycle,
be sure to exit the loop with vp == NULL if no candidates are found.
Formerly, this bug would cause the last vnode inspected to be used,
even if it was not available. The result was a panic "vn_finished_write:
neg cnt".

Sponsored by:	DARPA & NAI Labs.
2002-10-14 19:54:39 +00:00
mckusick
05ff8976a7 Unconditionally reset vp->v_vnlock back to the default in the
vclean() function (e.g., vp->v_vnlock = &vp->v_lock) rather
than requiring filesystems that use alternate locks to do so
in their vop_reclaim functions. This change is a further cleanup
of the vop_stdlock interface.

Submitted by:	Poul-Henning Kamp <phk@critter.freebsd.dk>
Sponsored by:	DARPA & NAI Labs.
2002-10-14 19:44:51 +00:00
phk
8762cef261 Populate more fields of the disklabel for PC98.
Submitted by:	Kawanobe Koh <kawanobe@st.rim.or.jp>
2002-10-14 14:22:29 +00:00
mckusick
25230d4c6a Regularize the vop_stdlock'ing protocol across all the filesystems
that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).

In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.

Sponsored by:	DARPA & NAI Labs.
2002-10-14 03:20:36 +00:00
alc
d7c3f24729 Eliminate the unnecessary clearing of flag bits that are already clear
in lio_listio(2).
2002-10-14 01:21:37 +00:00
mike
ca4781fcfe Update a sysctl to use _POSIX_VERSION from <sys/unistd.h>, instead of
the kernel option _KPOSIX_VERSION.
2002-10-13 14:26:29 +00:00
mike
86f382153e Include <sys/_posix.h> directly instead of depending on <sys/proc.h>
to include <sys/signal.h> to include <sys/_posix.h>.
2002-10-13 11:54:16 +00:00
alfred
f178c67fb0 whitespace fixes. 2002-10-12 22:26:41 +00:00
jeff
ef4d4e378e - Create a new scheduler api that is defined in sys/sched.h
- Begin moving scheduler specific functionality into sched_4bsd.c
 - Replace direct manipulation of scheduler data with hooks provided by the
   new api.
 - Remove KSE specific state modifications and single runq assumptions from
   kern_switch.c

Reviewed by:	-arch
2002-10-12 05:32:24 +00:00
peter
439214de2e Register the machine check private state spinlock on ia64. 2002-10-12 00:33:36 +00:00
jhb
6d8bd89354 - Move the 'done1' label down below the unlock of the proc lock and move
the locking of the proc lock after the goto to done1 to avoid locking
  the lock in an error case just so we can turn around and unlock it.
- Move the exec_setregs() stuff out from under the proc lock and after
  the p_args stuff.  This allows exec_setregs() to be able to sleep or
  write things out to userland, etc. which ia64 does.

Tested by:	peter
2002-10-11 21:04:01 +00:00
jhb
27f6c5f7cc Fix %z to always print values as signed like it is supposed to.
Reviewed by:	bde
Tested on:	i386 in ddb
2002-10-11 17:54:55 +00:00
mike
8630abe45f Change iov_base's type from char *' to the standard void *'. All
uses of iov_base which assume its type is `char *' (in order to do
pointer arithmetic) have been updated to cast iov_base to `char *'.
2002-10-11 14:58:34 +00:00
phk
ae4ae707ad Remove an unused variable. 2002-10-11 10:36:22 +00:00
mckusick
16ad96c43c When considering a vnode for reuse in getnewvnode, we call
vcanrecycle to check a free vnode's availability. If it is
available, vcanrecycle returns an error code of zero and the
vnode in question locked. The getnewvnode routine then used
to call vn_start_write with the V_NOWAIT flag. If the filesystem
was suspended while taking a snapshot, the vn_start_write would
fail but getnewvnode would fail to unlock the vnode, instead
leaving it locked on the freelist. The result would be that the
vnode would be locked forever and would eventually hang the
system with a race to the root when it was attempted to recycle
it. This fix moves the vn_start_write check into vcanrecycle
where it will properly unlock the vnode if it is unavailable
for recycling due to filesystem suspension.

Sponsored by:	DARPA & NAI Labs.
2002-10-11 01:04:14 +00:00
rwatson
683f085f65 Incremental style improvements: more consistently avoid assignments
in conditionals; remove some excess vertical whitespace; remove a
bug in the return handling of the delete_vp() case for MAC.

Spotted by:	bde
2002-10-10 13:59:58 +00:00
rwatson
9f9e137cd7 Regen from syntax fix to syscalls.master.
PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after:
2002-10-10 04:08:11 +00:00
rwatson
970f7235da Fix what looks like a merge-o from a conflict in the last commit to
syscalls.master.
2002-10-10 04:02:49 +00:00
rwatson
f1296e0875 Explore new heights in alphabetization for _file and _fd variations on
the extended attribute system calls.
2002-10-10 00:32:08 +00:00
peter
c4c1331bc1 Add a pointer to the alternate syscall tables on 64 bit platforms. 2002-10-09 22:04:09 +00:00
rwatson
48070e93ee Implement extattr_{delete,get,set}_link() system calls: extended attribute
operations that do not follow links.  Sync to MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-09 21:48:22 +00:00
rwatson
6f14c978b7 Regen. 2002-10-09 21:47:29 +00:00
rwatson
e3360b2da4 Flesh out the extattr_{delete,get,set}_link() system calls: variations
on the _file() theme that do not follow symlinks.  Sync to MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-09 21:47:04 +00:00
jhb
7cc0ed53c2 - Move p_cpulimit to struct proc from struct plimit and protect it with
sched_lock.  This means that we no longer access p_limit in mi_switch()
  and the p_limit pointer can be protected by the proc lock.
- Remove PRS_ZOMBIE check from CPU limit test in mi_switch().  PRS_ZOMBIE
  processes don't call mi_switch(), and even if they did there is no longer
  the danger of p_limit being NULL (which is what the original zombie check
  was added for).
- When we bump the current processes soft CPU limit in ast(), just bump the
  private p_cpulimit instead of the shared rlimit.  This fixes an XXX for
  some value of fix.  There is still a (probably benign) bug in that this
  code doesn't check that the new soft limit exceeds the hard limit.

Inspired by:	bde (2)
2002-10-09 17:17:24 +00:00
julian
6b6ba96b60 Round out the facilty for a 'bound' thread to loan out its KSE
in specific situations. The owner thread must be blocked, and the
borrower can not proceed back to user space with the borrowed KSE.
The borrower will return the KSE on the next context switch where
teh owner wants it back. This removes a lot of possible
race conditions and deadlocks. It is consceivable that the
borrower should inherit the priority of the owner too.
that's another discussion and would be simple to do.

Also, as part of this, the "preallocatd spare thread" is attached to the
thread doing a syscall rather than the KSE. This removes the need to lock
the scheduler when we want to access it, as it's now "at hand".

DDB now shows a lot mor info for threaded proceses though it may need
some optimisation to squeeze it all back into 80 chars again.
(possible JKH project)

Upcalls are now "bound" threads, but "KSE Lending" now means that
other completing syscalls can be completed using that KSE before the upcall
finally makes it back to the UTS. (getting threads OUT OF THE KERNEL is
one of the highest priorities in the KSE system.) The upcall when it happens
will present all the completed syscalls to the KSE for selection.
2002-10-09 02:33:36 +00:00
imp
56fe29cbbf Introducing /dev/devctl. This device reports events in the
configuration device hierarchy.  Device arrival, departure and not
matched are presently reported.  This will be the basis for devd, which
I still need to polish a little more before I commit it.  If you don't
use /dev/devctl, it will be a noop.
2002-10-07 23:17:44 +00:00
imp
9aab9af803 Two minor bugfixes:
o Allow the bus_debug variable to be set via the bus.debug tunable.
o Return pnpinfo and location info via the devinfo interface to userland.
  devinfo(8) needs to be updated to print it.
2002-10-07 23:15:40 +00:00
iedowse
cbd79f434a Add back a fdrop() call at the end of kern_open() that got lost in
revision 1.218. This bug caused a "struct file" reference to be
leaked if VOP_ADVLOCK(), vn_start_write(), or mac_check_vnode_write()
failed during the open operation.

PR:		kern/43739
Reported by:	Arne Woerner <woerner@mediabase-gmbh.de>
2002-10-07 20:49:22 +00:00
imp
f684519668 Add wrappers around the newly created bus_child_pnpinfo_str and
bus_child_location_str.
2002-10-07 07:08:00 +00:00
imp
95e1185856 Minor string handling cleanup that I've had in my tree for a while:
Don't use snprintf where strlcpy() will do the job.
Also, a NUL is '\0' not 0 in our style (C doesn't care), so spell it like.
Remove useless {} and () in the general area of this change.
2002-10-07 06:50:35 +00:00
imp
c6c68889b2 Don't need to NUL terminate after snprintf 2002-10-07 06:26:17 +00:00
imp
b88d741bf8 Add two interfaces to allow for busses to report the pnpinfo for
devices as well as their location on the bus.
2002-10-07 05:06:38 +00:00
alfred
54633f4bcc disable debug output by default. 2002-10-07 04:13:21 +00:00
rwatson
1f2df65750 Integrate mac_check_socket_send() and mac_check_socket_receive()
checks from the MAC tree: allow policies to perform access control
for the ability of a process to send and receive data via a socket.
At some point, we might also pass in additional address information
if an explicit address is requested on send.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-06 14:39:15 +00:00
rwatson
2ad996a2d3 Sync from MAC tree: break out the single mmap entry point into
seperate entry points for each occasion:

mac_check_vnode_mmap()		Check at initial mapping
mac_check_vnode_mprotect()	Check at mapping protection change
mac_check_vnode_mmap_downgrade()	Determine if a mapping downgrade
					should take place following
					subject relabel.

Implement mmap() and mprotect() entry points for labeled vnode
policies.  These entry points are currently not hooked up to the
VM system in the base tree.  These changes improve the consistency
of the access control interface and offer more flexibility regarding
limiting access to vnode mmaping.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-06 02:46:26 +00:00
rwatson
74ec128a1c Modify label allocation semantics for sockets: pass in soalloc's malloc
flags so that we can call malloc with M_NOWAIT if necessary, avoiding
potential sleeps while holding mutexes in the TCP syncache code.
Similar to the existing support for mbuf label allocation: if we can't
allocate all the necessary label store in each policy, we back out
the label allocation and fail the socket creation.  Sync from MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 21:23:47 +00:00
rwatson
eb55ee8585 Make sure that the accounting credential is saved along with the vp
when accounting is suspended--otherwise when accounting is restored,
we may incorrectly assume the credential is valid.

Panics experienced by:	juli
2002-10-05 20:05:23 +00:00
rwatson
7b150b70c2 Integrate a devfs/MAC fix from the MAC tree: avoid a race condition during
devfs VOP symlink creation by introducing a new entry point to determine
the label of the devfs_dirent prior to allocation of a vnode for the
symlink.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 18:40:10 +00:00
rwatson
abda58cc1e Merge support for mac_check_vnode_link(), a MAC framework/policy entry
point that instruments the creation of hard links.  Policy implementations
to follow.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 18:11:36 +00:00
rwatson
d273cfe761 While the MAC API has supported the ability to handle M_NOWAIT passed
to mbuf label initialization, that functionality was never merged to
the main tree.  Go ahead and merge that functionality now.  Note that
this requires policy modules to accept the case where the label
element may be destroyed even if init has not succeeded on it (in
the event that policy failed the init).  This will shortly also
apply to sockets.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 17:44:49 +00:00
rwatson
7a8226480f Rearrange object and label init/destroy functions to match the
order used in mac_policy.h and elsewhere.  Sort order is basically
"by operation category", then "alphabetically by object". Sync to
MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 17:38:45 +00:00
rwatson
5669cfde80 Sync to MAC tree: use 'flag' instead of 'how' for mac_init_mbuf();
remove a slightly less than useful comment.
2002-10-05 17:18:43 +00:00
green
7dad395c0e Don't allow dev_stdclone(9) to accept minors larger than the system is
able to handle (0xffffff).
2002-10-05 17:10:28 +00:00
rwatson
7c754b7adc Another big diff, little functional change: move label internalization,
externalization, and cred label life cycle events to entirely above
devfs and vnode events.  Sync from MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 16:57:16 +00:00
rwatson
aece5c85f0 Move all object label init/destroy routines to the head of the
entry points to better match the entry point ordering in mac_policy.h.
Big diff, no functional change; merge from the MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 16:54:59 +00:00
rwatson
c98d753496 Synch from TrustedBSD MAC tree:
- If a policy isn't registered when a policy module unloads, silently
  succeed.

- Hold the policy list lock across more of the validity tests to avoid
  races.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 16:46:03 +00:00
phk
951c3e53b2 NB: This commit does *NOT* make GEOM the default in FreeBSD
NB: But it will enable it in all kernels not having options "NO_GEOM"

Put the GEOM related options into the intended order.

Add "options NO_GEOM" to all kernel configs apart from NOTES.

In some order of controlled fashion, the NO_GEOM options will be
removed, architecture by architecture in the coming days.

There are currently three known issues which may force people to
need the NO_GEOM option:

boot0cfg/fdisk:
        Tries to update the MBR while it is being used to control
        slices.  GEOM does not allow this as a direct operation.

SCSI floppy drives:
        Appearantly the scsi-da driver return "EBUSY" if no media
        is inserted.  This is wrong, it should return ENXIO.

PC98:
        It is unclear if GEOM correctly recognizes all variants of
        PC98 disklabels.  (Help Wanted!  I have neither docs nor HW)

These issues are all being worked.

Sponsored by:	DARPA & NAI Labs.
2002-10-05 16:35:33 +00:00
rwatson
ca4946005d Cosmetic line wrap synchronization. 2002-10-05 16:33:46 +00:00
rwatson
8cc4bbaa82 Push the debugging obect label counters into security.mac.debug.counters
rather than directly under security.mac.debug.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 16:30:53 +00:00
rwatson
2670ddfd3d Begin another merge from the TrustedBSD MAC branch:
- Change mpo_init_foo(obj, label) and mpo_destroy_foo(obj, label) policy
  entry points to mpo_init_foo_label(label) and
  mpo_destroy_foo_label(label).  This will permit the use of the same
  entry points for holding temporary type-specific label during
  internalization and externalization, as well as for caching purposes.
- Because of this, break out mpo_{init,destroy}_socket() and
  mpo_{init,destroy}_mount() into seperate entry points for socket
  main/peer labels and mount main/fs labels.
- Since the prototype for label initialization is the same across almost
  all entry points, implement these entry points using common
  implementations for Biba, MLS, and Test, reducing the number of
  almost identical looking functions.

This simplifies policy implementation, as well as preparing us for the
merge of the new flexible userland API for managing labels on objects.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 15:10:00 +00:00
sobomax
18d9db4bb5 Fix problem introduced in rev.1.406, which can cause already unlocked
mutex being unlocked again causing system panic.
2002-10-05 12:56:10 +00:00
brian
89d8005f01 If dsgetlabel() returns a label with a size of zero in diskdumpconf(),
treat it as an invalid partition.

This fixes a bug where ``dumpon <device>'' will configure the dump
device at a random offset on the disk if <device> isn't a valid
partition.

Reviewed by: phk
2002-10-05 11:24:21 +00:00
jmallett
2061c32a16 Put an easy-to-miss assignment into the proper place. It was stray in the
middle of a block of code, with no clear assignment.  While here, move one
nearby assignment out of declaration.
2002-10-05 04:49:46 +00:00
jmallett
dbf91abc13 Remove bogus duplicate assignment of local variables. 2002-10-05 04:35:59 +00:00
phk
ae4fc4e63e Add the new function "sbuf_done()" which returns non-zero if the sbuf is
finished.

This allows sbufs to be used for request/response scenarioes without
needing additional communication flags.

Sponsored by:	DARPA & NAI Labs.
2002-10-04 09:58:17 +00:00
peter
978d530510 Add some unspeakable hackery to the tree under #ifdef __ia64__ to work
around limitations in the ia64 kernel stack handling code.  Basically
preallocate a bunch of threads (and hence kstacks) while contigmalloc()
still works, and never free them back to the general memory pool.  After
the system has been running for a while, contigmalloc() eventually fails
at a critical momemt and panics the system.
2002-10-04 01:31:39 +00:00
truckman
8ad27d9922 hashinit() calls MALLOC(), so release the filedesc lock in knote_attach()
before calling hashinit() and relock afterwards, taking care to see that
we don't lose a race.
2002-10-03 06:03:26 +00:00
jmallett
2eba9107fd XXX Add a check for p->p_limit being NULL before dereferencing it. This is
totally bogus but will hide the occurances of access of 0xbc(NULL) which
people have run into lately.  This is not a proper fix, just a bandaid, until
the cause of this happening is tracked down and fixed.

Reviewed by:	rwatson
2002-10-03 04:09:00 +00:00
truckman
da2757cbc5 In an SMP environment post-Giant it is no longer safe to blindly
dereference the struct sigio pointer without any locking.  Change
fgetown() to take a reference to the pointer instead of a copy of the
pointer and call SIGIO_LOCK() before copying the pointer and
dereferencing it.

Reviewed by:	rwatson
2002-10-03 02:13:00 +00:00
davidxu
d86ebf792e set ke_bound to NULL when kse owner thread becomes runnable.
Reviewed by: julian (mentor)
2002-10-03 01:22:05 +00:00
julian
bcda4f654e Whitespace fix only 2002-10-02 23:12:01 +00:00
jhb
8c9a393a04 Rename the mutex thread and process states to use a more generic 'LOCK'
name instead.  (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of
TD_ON_MUTEX())  Eventually a turnstile abstraction will be added that
will be shared with mutexes and other types of locks.  SLOCK/TDI_LOCK will
be used internally by the turnstile code and will not be specific to
mutexes.  Making the change now ensures that turnstiles can be dropped
in at a later date without affecting the ABI of userland applications.
2002-10-02 20:31:47 +00:00
jmallett
cd6e4d8c7c Access td->td_kse inside sched_lock.
Submitted by:	julian
2002-10-02 18:25:09 +00:00
archie
9301eb9484 Let kse_wakeup() take a KSE mailbox pointer argument.
Reviewed by:	julian
2002-10-02 16:48:16 +00:00
jmallett
056df6de99 De-obfuscate local use of members of 'struct thread', for which we have
local variables, and group assignment.
2002-10-02 16:39:39 +00:00
phk
feb8c6a405 Absorb <sys/bus_private.h> into kern/subr_bus.c to prevent misunderstandings.
Suggested by:	bde
Approved by:	dfr
2002-10-02 09:34:29 +00:00
phk
76d8452fbf Fix mis-indentation.
Spotted by:	FlexeLint
2002-10-02 09:09:25 +00:00
scottl
3a150bca9c Some kernel threads try to do significant work, and the default KSTACK_PAGES
doesn't give them enough stack to do much before blowing away the pcb.
This adds MI and MD code to allow the allocation of an alternate kstack
who's size can be speficied when calling kthread_create.  Passing the
value 0 prevents the alternate kstack from being created.  Note that the
ia64 MD code is missing for now, and PowerPC was only partially written
due to the pmap.c being incomplete there.
Though this patch does not modify anything to make use of the alternate
kstack, acpi and usb are good candidates.

Reviewed by:	jake, peter, jhb
2002-10-02 07:44:29 +00:00
rwatson
4be0d09ad3 Add a new MAC entry point, mac_thread_userret(td), which permits policy
modules to perform MAC-related events when a thread returns to user
space.  This is required for policies that have floating process labels,
as it's not always possible to acquire the process lock at arbitrary
points in the stack during system call processing; process labels might
represent traditional authentication data, process history information,
or other data.

LOMAC will use this entry point to perform the process label update
prior to the thread returning to userspace, when plugged into the MAC
framework.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-02 02:42:38 +00:00
jmallett
7a693db242 Back our kernel support for reliable signal queues.
Requested by:	rwatson, phk, and many others
2002-10-01 17:15:53 +00:00
jhb
cb9df84644 Minor style nits in a comment. 2002-10-01 15:49:32 +00:00
phk
b55fa4540e Fix some harmless mis-indents.
Spotted by:	FlexeLint
2002-10-01 15:48:31 +00:00
phk
32912489c9 Remember to include "opt_devfs.h" so we get any relevant changes
to NDEVFSINO before we include devfs.h.

Spotted by:	FlexeLint
2002-10-01 15:24:35 +00:00
jhb
014597718b Various style fixups.
Submitted by:	bde (mostly)
2002-10-01 14:16:50 +00:00
jhb
a1ef3e37b0 Actually clear PS_XCPU in ast() when we handle it.
Submitted by:	bde
Pointy hat to:	jhb
2002-10-01 14:13:13 +00:00
jhb
02678fb50e - Adjust comment noting that handling of CPU limit exhaustion is done in
ast().
- Actually set KEF_ASTPENDING so ast() is called.  I think this is buggy
  for a process with multiple KSE's in that PS_XCPU is not a KSE event,
  it's a process-wide event.  IMO there really should probably be two
  ASTPENDING flags, one for per-process, and one for per-KSE.

Submitted by:	bde
2002-10-01 14:10:08 +00:00
phk
5354d80714 Don't #error if we are lint. 2002-10-01 13:15:11 +00:00
phk
19150ba4f8 Split MBR and PC98 on-disk sliceformats out from disklabel.h, step 1:
Peter had repocopied sys/disklabel.h to sys/diskpc98.h and sys/diskmbr.h.

These two new copies are still intact copies of disklabel.h and
therefore protected by #ifndef _SYS_DISKLABEL_H_ so #including them
in programs which already include <sys.disklabel.h> is currently a
no-op.

This commit adds a number of such #includes.

Once I have verified that I have fixed all the places which need fixing,
I will commit the updated versions of the three #include files.

Sponsored by:   DARPA & NAI Labs.
2002-10-01 07:24:55 +00:00
rwatson
0b0e14e462 Improve locking of pipe mutexes in the context of MAC:
(1) Where previously the pipe mutex was selectively grabbed during
    pipe_ioctl(), now always grab it and then release if if not
    needed.  This protects the call to mac_check_pipe_ioctl() to
    make sure the label remains consistent.  (Note: it looks
    like sigio locking may be incorrect for fgetown() since we
    call it not-by-reference and sigio locking assumes call by
    reference).

(2) In pipe_stat(), lock the pipe if MAC is compiled in so that
    the call to mac_check_pipe_stat() gets a locked pipe to
    protect label consistency.  We still release the lock before
    returning actual stat() data, risking inconsistency, but
    apparently our pipe locking model accepts that risk.

(3) In various pipe MAC authorization checks, assert that the pipe
    lock is held.

(4) Grab the lock when performing a pipe relabel operation, and
    assert it a little deeper in the stack.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-01 04:30:19 +00:00
rwatson
d95d2f1aae Push 'security.mac.debug_label_fallback' behind options MAC_DEBUG.
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-10-01 03:24:20 +00:00
jmallett
7d2081be83 Until I find a way to release arbitrary locks held when sending signals (there
really should not be some), use the M_NOWAIT flag to malloc(9), and panic(9)
if malloc(9) fails.
2002-10-01 03:19:49 +00:00
rwatson
40b01ec743 Regen. 2002-10-01 02:37:35 +00:00
rwatson
0e55f4c4ed Reserve system call numbers for the following system calls:
__mac_get_pid		Retrieve MAC label of a process by pid

Similar to __mac_get_proc() except that the target process of
the operation is explicitly specified rather than assuming
curthread.

__mac_get_link		Retrieve MAC label of a path with NOFOLLOW
__mac_set_link		Set MAC label of a path with NOFOLLOW
extattr_set_link	Set EAs on a path with NOFOLLOW
extattr_get_link	Retrieve EAs on a path with NOFOLLOW
extattr_delete_link	Delete EAs on a path with NOFOLLOW

These calls are similar to __mac_get_file(), __mac_set_file(),
extattr_set_file(), extattr_get_file(), and extattr_delete_file(),
except that they do not follow symlinks.  The distinction between
these calls is similar to lchown() vs chown().

Implementations to follow.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-01 02:35:59 +00:00
jmallett
8ff3150660 Back out code changes that snuck into the previous forced commit. 2002-10-01 00:16:17 +00:00
jmallett
a8d86705cf (Forced commit, to clarify previous commit of ksiginfo/signal queue code.)
I've added a structure, kernel-private, to represent a pending or in-delivery
signal, called `ksiginfo'.  It is roughly analogous to the basic information
that is exported by the POSIX interface 'siginfo_t', but more basic.  I've
added functions to allocate these structures, and further to wrap all signal
operations using them.

Once the operations are wrapped, I've added a TailQ (see queue(3)) of these
structures to 'struct proc', and all pending signals are in that TailQ.  When
a signal is being delivered, it is dequeued from the list.  Once I finish
the spreading of ksiginfo throughout the tree, the dequeued structure will be
delivered to the process in question, whereas currently and normally, the
signal number is what is used.
2002-10-01 00:07:28 +00:00
jhb
f350519764 - Add a new per-process flag PS_XCPU to indicate that at least one thread
has exceeded its CPU time limit.
- In mi_switch(), set PS_XCPU when the CPU time limit is exceeded.
- Perform actual CPU time limit exceeded work in ast() when PS_XCPU is set.

Requested by:	many
2002-09-30 21:13:54 +00:00
jhb
f72526c16f Change p_cpulimit to be in seconds instead of microseconds. Since
p_runtime now is a bintime, it is no longer an optimization to store
p_cpulimit as microseconds.

Suggested by:	phk
2002-09-30 21:08:38 +00:00
rwatson
5d5060bddf Move vnode MAC label initialization to after the release of the vnode
interlock in getnewvnode() to avoid possible sleeps while holding
the mutex.  Note that the warning from Witness is a slight false
positive since we know there will be no contention on the interlock
since we haven't made the vnode available for use yet, but the theory
is not a bad one.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-09-30 20:51:48 +00:00
rwatson
731b954aba Add tunables for the existing sysctl twiddles for pipe and vm
enforcement so they can be disabled prior to kernel start.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-09-30 20:50:00 +00:00
jmallett
0341f71df1 First half of implementation of ksiginfo, signal queues, and such. This
gets signals operating based on a TailQ, and is good enough to run X11,
GNOME, and do job control.  There are some intricate parts which could be
more refined to match the sigset_t versions, but those require further
evaluation of directions in which our signal system can expand and contract
to fit our needs.

After this has been in the tree for a while, I will make in kernel API
changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo
more robustly, such that we can actually pass information with our
(queued) signals to the userland.  That will also result in using a
struct ksiginfo pointer, rather than a signal number, in a lot of
kern_sig.c, to refer to an individual pending signal queue member, but
right now there is no defined behaviour for such.

CODAFS is unfinished in this regard because the logic is unclear in
some places.

Sponsored by:	New Gold Technology
Reviewed by:	bde, tjr, jake [an older version, logic similar]
2002-09-30 20:20:22 +00:00
phk
636cee6b01 Plug memory leaks.
Detected by:	FlexeLint
Approved by:	jhb
2002-09-30 19:19:47 +00:00
julian
fbf94f64b8 uh, commit all of the patch 2002-09-29 23:28:58 +00:00
julian
bac3b741a4 commit the version I actually tested..
Submitted by:	davidxu
2002-09-29 23:23:25 +00:00
julian
d91c37553e Implement basic KSE loaning. This stops a hread that is blocked in BOUND mode
from stopping another thread from completing a syscall, and this allows it to
release its resources etc. Probably more related commits to follow (at least
one I know of)

Initial concept by: julian, dillon
Submitted by:	davidxu
2002-09-29 23:04:34 +00:00
obrien
30b02a2de2 Fix style nit where conditionally compiled code was unconditionalized,
but style(9) was consulted.

Submitted by:	bde
2002-09-29 04:47:41 +00:00
julian
71a47fc4fb lock proc while calling psignal
(plus related cleanups)

Submitted by:	davidxu
2002-09-29 02:48:37 +00:00
phk
1265230747 Move includ of <sys/bus_priate.h> later to get semantic identity of
device_t the same throughout kernel.

This is a very fine point of C which fortunatly does not make any
difference in normal circumstances but which due to the pervasiveness
of device_t in the kernel can make a lint barf a lot.
2002-09-28 21:38:35 +00:00
phk
a4c01edbf3 Change a return to a break so the local buffers get properly freeed.
Spotte by:	FlexeLint

Reviewed by:	rwatson
2002-09-28 21:34:31 +00:00
phk
c5be3924c3 Remove unused includes.
Clarify the intention of a while();
Move a local variable to avoid potential name-confusion.
2002-09-28 17:46:30 +00:00
phk
1dfc2c167f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
phk
107a67d351 Correctly order VI_UNLOCK(), local variables and block comment. 2002-09-28 12:15:44 +00:00
julian
e8d03df6f1 Rewrite the kse_create() function to better aproach the semantics we
have specified in the design.
2002-09-28 08:44:31 +00:00
jake
c03e93ab3a Add a workaround for what seems to be confusion between binutils and the
sparc v9 ABI.  The Elf_Rela records for local symbols appear to already
have the symbol's value added in to the addend field, even though the ABI
specifies we need to lookup the symbol and add its value too.  This breaks
text relocations in klds because the symbol's value is added twice, and
the resulting address points off into nowhere land, so for now just use
the addend.

Tested by:	rwatson
2002-09-27 23:12:53 +00:00
phk
834ddd1bcc Rename struct specinfo to the more appropriate struct cdev.
Agreed on:	jake, rwatson, jhb
2002-09-27 18:27:10 +00:00
julian
913026ad48 Redo how completing threads pass their state to userland
if they are not going to cross over themselves. Also change how the list of
completed user threads is tracked and passed to the KSE. This is not
a change in design but rather the implementation of what was originally
envisionned.
2002-09-27 07:11:11 +00:00
phk
ccb0271ad3 Under DIAGNOSTIC, complain if ENOIOCTL leaks out through VOP_IOCTL(). 2002-09-26 21:21:13 +00:00
phk
9eadc24323 Make biowait() check bio_error before the BIO_ERROR flag, to propery
catch internal GEOM use of bio_error.

Sponsored by:	DARPA & NAI Labs.
2002-09-26 16:32:14 +00:00
jeff
536752d481 - Export the alq daemon thread pointer.
- Don't log ktr events from the alq daemon.
2002-09-26 07:38:56 +00:00
jeff
31b1ddae74 - Move ASSERT_VOP_*LOCK* functionality into functions in vfs_subr.c
- Make the VI asserts more orthogonal to the rest of the asserts by using a
   new, common vfs_badlock() function and adding a 'str' arg.
 - Adjust generated ASSERTS to match the new prototype.
 - Adjust explicit ASSERTS to match the new prototype.
2002-09-26 04:48:44 +00:00
jeff
3a58c4d63e - We don't need any automated lock checking for vop_islocked. 2002-09-26 00:31:16 +00:00
archie
904b65e85d Make the following name changes to KSE related functions, etc., to better
represent their purpose and minimize namespace conflicts:

	kse_fn_t		-> kse_func_t
	struct thread_mailbox	-> struct kse_thr_mailbox
	thread_interrupt()	-> kse_thr_interrupt()
	kse_yield()		-> kse_release()
	kse_new()		-> kse_create()

Add missing declaration of kse_thr_interrupt() to <sys/kse.h>.
Regenerate the various generated syscall files. Minor style fixes.

Reviewed by:	julian
2002-09-25 18:10:42 +00:00
bde
23905c1c46 Round up instead of towards 0 in clock_getres() so that a resolution of
0 is never returned.

PR:		41781
MFC after:	3 days
2002-09-25 12:00:38 +00:00
jeff
ee7cd9172d - Lock down the syncer with sync_mtx.
- Enable vfs_badlock_mutex by default.
 - Assert that the vp is locked in VOP_UNLOCK.
 - Use standard interlock macros in remaining code.
 - Correct a race in getnewvnode().
 - Lock access to v_numoutput with interlock.
 - Lock access to buf lists and splay tree with interlock.
 - Add VOP and VI asserts.
 - Lock b_vnbufs with the vnode interlock.
 - Add vrefcnt() for callers who want to retreive the vnode ref without
   holding a lock.  Add a comment that describes when this is safe.
 - Add vholdl() and vdropl() so that callers who already own the interlock
   can avoid race conditions and unnecessary unlocking.
 - Move the VOP_GETATTR() in vflush() into the WRITECLOSE conditional case.
 - Hold the interlock before droping the mntlist_mtx in vflush() to avoid
   a race.
 - Fix locking in vfs_msync().
2002-09-25 02:22:21 +00:00
jeff
881a59ab9e - Properly lock v_vflags in getdirents(). 2002-09-25 02:13:38 +00:00
jeff
8ec2a2de7d - Use incore() where no other interlock locking is necessary.
- Lock access to numoutput.
2002-09-25 02:12:32 +00:00
jeff
54956e8ea4 - Lock accesses to v_numoutput.
- Lock calls to gbincore.
2002-09-25 02:11:37 +00:00
jeff
fb08412291 - Don't protect mountedhere with the vn interlock.
- Protect mountedhere with the vn lock.
2002-09-25 01:44:21 +00:00
jeff
0649189fd7 - Use the standard vp interlock macros. 2002-09-25 01:42:24 +00:00
julian
e839899222 Don't use local variable 'p' in a debug statement.. we removed it. 2002-09-23 14:06:12 +00:00
julian
14621f8ead oops don't do dthe copy range in a new KSE. There isn't one any more. 2002-09-23 14:01:01 +00:00
julian
bcb38a31ff slightly clean up the thread_userret() and thread_consider_upcall() calls.
also some slight changes for TDF_BOUND testing and small style changes
Should ONLY affect KSE programs

Submitted by:	davidxu
2002-09-23 06:14:30 +00:00
julian
5f1e8c6326 Add code to create > 1 KSe per process.
(support code not yet complete)

Submitted by:	davidxu
2002-09-23 06:10:24 +00:00
julian
48fb4349d0 Indentation does not define a block.. you need breces {} as well..
also add a mutex assert.  (threaded path only)

Submitted by:	davidxu
2002-09-23 05:27:30 +00:00
jeff
4b760e8f27 - Hold the credential of the caller and use it in all subsequent vn ops.
- Get rid of the ill conceived aq_td field.

Suggested by:	rwatson
2002-09-23 05:20:00 +00:00
jeff
f1712a0c27 - Add support for logging KTR via ALQ. This is optional and enabled by the
KTR_ALQ config option.
2002-09-22 07:13:45 +00:00
jeff
4e382c3af5 - Tell witness about ALQ's spin lock. 2002-09-22 07:11:57 +00:00