Commit Graph

5614 Commits

Author SHA1 Message Date
Robert Watson
2dba710ddb Incremental style improvements: more consistently avoid assignments
in conditionals; remove some excess vertical whitespace; remove a
bug in the return handling of the delete_vp() case for MAC.

Spotted by:	bde
2002-10-10 13:59:58 +00:00
Robert Watson
16c26e60ef Regen from syntax fix to syscalls.master.
PR:
Submitted by:
Reviewed by:
Approved by:
Obtained from:
MFC after:
2002-10-10 04:08:11 +00:00
Robert Watson
3c4aba09e3 Fix what looks like a merge-o from a conflict in the last commit to
syscalls.master.
2002-10-10 04:02:49 +00:00
Robert Watson
b101411be1 Explore new heights in alphabetization for _file and _fd variations on
the extended attribute system calls.
2002-10-10 00:32:08 +00:00
Peter Wemm
0d66d36f44 Add a pointer to the alternate syscall tables on 64 bit platforms. 2002-10-09 22:04:09 +00:00
Robert Watson
6f90723cad Implement extattr_{delete,get,set}_link() system calls: extended attribute
operations that do not follow links.  Sync to MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-09 21:48:22 +00:00
Robert Watson
233d463548 Regen. 2002-10-09 21:47:29 +00:00
Robert Watson
8b10835c35 Flesh out the extattr_{delete,get,set}_link() system calls: variations
on the _file() theme that do not follow symlinks.  Sync to MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-09 21:47:04 +00:00
John Baldwin
5715307f74 - Move p_cpulimit to struct proc from struct plimit and protect it with
sched_lock.  This means that we no longer access p_limit in mi_switch()
  and the p_limit pointer can be protected by the proc lock.
- Remove PRS_ZOMBIE check from CPU limit test in mi_switch().  PRS_ZOMBIE
  processes don't call mi_switch(), and even if they did there is no longer
  the danger of p_limit being NULL (which is what the original zombie check
  was added for).
- When we bump the current processes soft CPU limit in ast(), just bump the
  private p_cpulimit instead of the shared rlimit.  This fixes an XXX for
  some value of fix.  There is still a (probably benign) bug in that this
  code doesn't check that the new soft limit exceeds the hard limit.

Inspired by:	bde (2)
2002-10-09 17:17:24 +00:00
Julian Elischer
48bfcddd94 Round out the facilty for a 'bound' thread to loan out its KSE
in specific situations. The owner thread must be blocked, and the
borrower can not proceed back to user space with the borrowed KSE.
The borrower will return the KSE on the next context switch where
teh owner wants it back. This removes a lot of possible
race conditions and deadlocks. It is consceivable that the
borrower should inherit the priority of the owner too.
that's another discussion and would be simple to do.

Also, as part of this, the "preallocatd spare thread" is attached to the
thread doing a syscall rather than the KSE. This removes the need to lock
the scheduler when we want to access it, as it's now "at hand".

DDB now shows a lot mor info for threaded proceses though it may need
some optimisation to squeeze it all back into 80 chars again.
(possible JKH project)

Upcalls are now "bound" threads, but "KSE Lending" now means that
other completing syscalls can be completed using that KSE before the upcall
finally makes it back to the UTS. (getting threads OUT OF THE KERNEL is
one of the highest priorities in the KSE system.) The upcall when it happens
will present all the completed syscalls to the KSE for selection.
2002-10-09 02:33:36 +00:00
Warner Losh
0b294f891d Introducing /dev/devctl. This device reports events in the
configuration device hierarchy.  Device arrival, departure and not
matched are presently reported.  This will be the basis for devd, which
I still need to polish a little more before I commit it.  If you don't
use /dev/devctl, it will be a noop.
2002-10-07 23:17:44 +00:00
Warner Losh
c17fdbe3a9 Two minor bugfixes:
o Allow the bus_debug variable to be set via the bus.debug tunable.
o Return pnpinfo and location info via the devinfo interface to userland.
  devinfo(8) needs to be updated to print it.
2002-10-07 23:15:40 +00:00
Ian Dowse
197b023b1b Add back a fdrop() call at the end of kern_open() that got lost in
revision 1.218. This bug caused a "struct file" reference to be
leaked if VOP_ADVLOCK(), vn_start_write(), or mac_check_vnode_write()
failed during the open operation.

PR:		kern/43739
Reported by:	Arne Woerner <woerner@mediabase-gmbh.de>
2002-10-07 20:49:22 +00:00
Warner Losh
0a1d3ef9b8 Add wrappers around the newly created bus_child_pnpinfo_str and
bus_child_location_str.
2002-10-07 07:08:00 +00:00
Warner Losh
d71dec96bf Minor string handling cleanup that I've had in my tree for a while:
Don't use snprintf where strlcpy() will do the job.
Also, a NUL is '\0' not 0 in our style (C doesn't care), so spell it like.
Remove useless {} and () in the general area of this change.
2002-10-07 06:50:35 +00:00
Warner Losh
da7b83f9ea Don't need to NUL terminate after snprintf 2002-10-07 06:26:17 +00:00
Warner Losh
3d9841b4eb Add two interfaces to allow for busses to report the pnpinfo for
devices as well as their location on the bus.
2002-10-07 05:06:38 +00:00
Alfred Perlstein
c814aa3fdb disable debug output by default. 2002-10-07 04:13:21 +00:00
Robert Watson
b371c939ce Integrate mac_check_socket_send() and mac_check_socket_receive()
checks from the MAC tree: allow policies to perform access control
for the ability of a process to send and receive data via a socket.
At some point, we might also pass in additional address information
if an explicit address is requested on send.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-06 14:39:15 +00:00
Robert Watson
e183f80e54 Sync from MAC tree: break out the single mmap entry point into
seperate entry points for each occasion:

mac_check_vnode_mmap()		Check at initial mapping
mac_check_vnode_mprotect()	Check at mapping protection change
mac_check_vnode_mmap_downgrade()	Determine if a mapping downgrade
					should take place following
					subject relabel.

Implement mmap() and mprotect() entry points for labeled vnode
policies.  These entry points are currently not hooked up to the
VM system in the base tree.  These changes improve the consistency
of the access control interface and offer more flexibility regarding
limiting access to vnode mmaping.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-06 02:46:26 +00:00
Robert Watson
83985c267e Modify label allocation semantics for sockets: pass in soalloc's malloc
flags so that we can call malloc with M_NOWAIT if necessary, avoiding
potential sleeps while holding mutexes in the TCP syncache code.
Similar to the existing support for mbuf label allocation: if we can't
allocate all the necessary label store in each policy, we back out
the label allocation and fail the socket creation.  Sync from MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 21:23:47 +00:00
Robert Watson
b497ca81d6 Make sure that the accounting credential is saved along with the vp
when accounting is suspended--otherwise when accounting is restored,
we may incorrectly assume the credential is valid.

Panics experienced by:	juli
2002-10-05 20:05:23 +00:00
Robert Watson
74e62b1b75 Integrate a devfs/MAC fix from the MAC tree: avoid a race condition during
devfs VOP symlink creation by introducing a new entry point to determine
the label of the devfs_dirent prior to allocation of a vnode for the
symlink.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 18:40:10 +00:00
Robert Watson
0a69419678 Merge support for mac_check_vnode_link(), a MAC framework/policy entry
point that instruments the creation of hard links.  Policy implementations
to follow.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 18:11:36 +00:00
Robert Watson
56c1541237 While the MAC API has supported the ability to handle M_NOWAIT passed
to mbuf label initialization, that functionality was never merged to
the main tree.  Go ahead and merge that functionality now.  Note that
this requires policy modules to accept the case where the label
element may be destroyed even if init has not succeeded on it (in
the event that policy failed the init).  This will shortly also
apply to sockets.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 17:44:49 +00:00
Robert Watson
87807196f8 Rearrange object and label init/destroy functions to match the
order used in mac_policy.h and elsewhere.  Sort order is basically
"by operation category", then "alphabetically by object". Sync to
MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 17:38:45 +00:00
Robert Watson
a931e345a9 Sync to MAC tree: use 'flag' instead of 'how' for mac_init_mbuf();
remove a slightly less than useful comment.
2002-10-05 17:18:43 +00:00
Brian Feldman
dab3d85fd7 Don't allow dev_stdclone(9) to accept minors larger than the system is
able to handle (0xffffff).
2002-10-05 17:10:28 +00:00
Robert Watson
69bbb5b1c7 Another big diff, little functional change: move label internalization,
externalization, and cred label life cycle events to entirely above
devfs and vnode events.  Sync from MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 16:57:16 +00:00
Robert Watson
08bcdc586e Move all object label init/destroy routines to the head of the
entry points to better match the entry point ordering in mac_policy.h.
Big diff, no functional change; merge from the MAC tree.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 16:54:59 +00:00
Robert Watson
ea599aa018 Synch from TrustedBSD MAC tree:
- If a policy isn't registered when a policy module unloads, silently
  succeed.

- Hold the policy list lock across more of the validity tests to avoid
  races.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 16:46:03 +00:00
Poul-Henning Kamp
3bd6561289 NB: This commit does *NOT* make GEOM the default in FreeBSD
NB: But it will enable it in all kernels not having options "NO_GEOM"

Put the GEOM related options into the intended order.

Add "options NO_GEOM" to all kernel configs apart from NOTES.

In some order of controlled fashion, the NO_GEOM options will be
removed, architecture by architecture in the coming days.

There are currently three known issues which may force people to
need the NO_GEOM option:

boot0cfg/fdisk:
        Tries to update the MBR while it is being used to control
        slices.  GEOM does not allow this as a direct operation.

SCSI floppy drives:
        Appearantly the scsi-da driver return "EBUSY" if no media
        is inserted.  This is wrong, it should return ENXIO.

PC98:
        It is unclear if GEOM correctly recognizes all variants of
        PC98 disklabels.  (Help Wanted!  I have neither docs nor HW)

These issues are all being worked.

Sponsored by:	DARPA & NAI Labs.
2002-10-05 16:35:33 +00:00
Robert Watson
226b96fb6d Cosmetic line wrap synchronization. 2002-10-05 16:33:46 +00:00
Robert Watson
b2f0927ad6 Push the debugging obect label counters into security.mac.debug.counters
rather than directly under security.mac.debug.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 16:30:53 +00:00
Robert Watson
96adb90996 Begin another merge from the TrustedBSD MAC branch:
- Change mpo_init_foo(obj, label) and mpo_destroy_foo(obj, label) policy
  entry points to mpo_init_foo_label(label) and
  mpo_destroy_foo_label(label).  This will permit the use of the same
  entry points for holding temporary type-specific label during
  internalization and externalization, as well as for caching purposes.
- Because of this, break out mpo_{init,destroy}_socket() and
  mpo_{init,destroy}_mount() into seperate entry points for socket
  main/peer labels and mount main/fs labels.
- Since the prototype for label initialization is the same across almost
  all entry points, implement these entry points using common
  implementations for Biba, MLS, and Test, reducing the number of
  almost identical looking functions.

This simplifies policy implementation, as well as preparing us for the
merge of the new flexible userland API for managing labels on objects.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-05 15:10:00 +00:00
Maxim Sobolev
790a8088d0 Fix problem introduced in rev.1.406, which can cause already unlocked
mutex being unlocked again causing system panic.
2002-10-05 12:56:10 +00:00
Brian Somers
52ae0b7fb5 If dsgetlabel() returns a label with a size of zero in diskdumpconf(),
treat it as an invalid partition.

This fixes a bug where ``dumpon <device>'' will configure the dump
device at a random offset on the disk if <device> isn't a valid
partition.

Reviewed by: phk
2002-10-05 11:24:21 +00:00
Juli Mallett
0d29446006 Put an easy-to-miss assignment into the proper place. It was stray in the
middle of a block of code, with no clear assignment.  While here, move one
nearby assignment out of declaration.
2002-10-05 04:49:46 +00:00
Juli Mallett
ecafb24b41 Remove bogus duplicate assignment of local variables. 2002-10-05 04:35:59 +00:00
Poul-Henning Kamp
c5f9218b48 Add the new function "sbuf_done()" which returns non-zero if the sbuf is
finished.

This allows sbufs to be used for request/response scenarioes without
needing additional communication flags.

Sponsored by:	DARPA & NAI Labs.
2002-10-04 09:58:17 +00:00
Peter Wemm
c281972e61 Add some unspeakable hackery to the tree under #ifdef __ia64__ to work
around limitations in the ia64 kernel stack handling code.  Basically
preallocate a bunch of threads (and hence kstacks) while contigmalloc()
still works, and never free them back to the general memory pool.  After
the system has been running for a while, contigmalloc() eventually fails
at a critical momemt and panics the system.
2002-10-04 01:31:39 +00:00
Don Lewis
cb81d3ca4d hashinit() calls MALLOC(), so release the filedesc lock in knote_attach()
before calling hashinit() and relock afterwards, taking care to see that
we don't lose a race.
2002-10-03 06:03:26 +00:00
Juli Mallett
a723033a4d XXX Add a check for p->p_limit being NULL before dereferencing it. This is
totally bogus but will hide the occurances of access of 0xbc(NULL) which
people have run into lately.  This is not a proper fix, just a bandaid, until
the cause of this happening is tracked down and fixed.

Reviewed by:	rwatson
2002-10-03 04:09:00 +00:00
Don Lewis
91e97a8266 In an SMP environment post-Giant it is no longer safe to blindly
dereference the struct sigio pointer without any locking.  Change
fgetown() to take a reference to the pointer instead of a copy of the
pointer and call SIGIO_LOCK() before copying the pointer and
dereferencing it.

Reviewed by:	rwatson
2002-10-03 02:13:00 +00:00
David Xu
5da2b58aeb set ke_bound to NULL when kse owner thread becomes runnable.
Reviewed by: julian (mentor)
2002-10-03 01:22:05 +00:00
Julian Elischer
4162f2fe92 Whitespace fix only 2002-10-02 23:12:01 +00:00
John Baldwin
551cf4e150 Rename the mutex thread and process states to use a more generic 'LOCK'
name instead.  (e.g., SLOCK instead of SMTX, TD_ON_LOCK() instead of
TD_ON_MUTEX())  Eventually a turnstile abstraction will be added that
will be shared with mutexes and other types of locks.  SLOCK/TDI_LOCK will
be used internally by the turnstile code and will not be specific to
mutexes.  Making the change now ensures that turnstiles can be dropped
in at a later date without affecting the ABI of userland applications.
2002-10-02 20:31:47 +00:00
Juli Mallett
289e1e23d1 Access td->td_kse inside sched_lock.
Submitted by:	julian
2002-10-02 18:25:09 +00:00
Archie Cobbs
36a8dac10d Let kse_wakeup() take a KSE mailbox pointer argument.
Reviewed by:	julian
2002-10-02 16:48:16 +00:00
Juli Mallett
bc7b9f1dba De-obfuscate local use of members of 'struct thread', for which we have
local variables, and group assignment.
2002-10-02 16:39:39 +00:00
Poul-Henning Kamp
c56c20f13d Absorb <sys/bus_private.h> into kern/subr_bus.c to prevent misunderstandings.
Suggested by:	bde
Approved by:	dfr
2002-10-02 09:34:29 +00:00
Poul-Henning Kamp
8c5d013757 Fix mis-indentation.
Spotted by:	FlexeLint
2002-10-02 09:09:25 +00:00
Scott Long
316ec49abd Some kernel threads try to do significant work, and the default KSTACK_PAGES
doesn't give them enough stack to do much before blowing away the pcb.
This adds MI and MD code to allow the allocation of an alternate kstack
who's size can be speficied when calling kthread_create.  Passing the
value 0 prevents the alternate kstack from being created.  Note that the
ia64 MD code is missing for now, and PowerPC was only partially written
due to the pmap.c being incomplete there.
Though this patch does not modify anything to make use of the alternate
kstack, acpi and usb are good candidates.

Reviewed by:	jake, peter, jhb
2002-10-02 07:44:29 +00:00
Robert Watson
92dbb82a47 Add a new MAC entry point, mac_thread_userret(td), which permits policy
modules to perform MAC-related events when a thread returns to user
space.  This is required for policies that have floating process labels,
as it's not always possible to acquire the process lock at arbitrary
points in the stack during system call processing; process labels might
represent traditional authentication data, process history information,
or other data.

LOMAC will use this entry point to perform the process label update
prior to the thread returning to userspace, when plugged into the MAC
framework.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-02 02:42:38 +00:00
Juli Mallett
1d9c56964d Back our kernel support for reliable signal queues.
Requested by:	rwatson, phk, and many others
2002-10-01 17:15:53 +00:00
John Baldwin
feb2449610 Minor style nits in a comment. 2002-10-01 15:49:32 +00:00
Poul-Henning Kamp
8d3574c7a4 Fix some harmless mis-indents.
Spotted by:	FlexeLint
2002-10-01 15:48:31 +00:00
Poul-Henning Kamp
328048bc56 Remember to include "opt_devfs.h" so we get any relevant changes
to NDEVFSINO before we include devfs.h.

Spotted by:	FlexeLint
2002-10-01 15:24:35 +00:00
John Baldwin
6cae6dacd5 Various style fixups.
Submitted by:	bde (mostly)
2002-10-01 14:16:50 +00:00
John Baldwin
f6ccde8308 Actually clear PS_XCPU in ast() when we handle it.
Submitted by:	bde
Pointy hat to:	jhb
2002-10-01 14:13:13 +00:00
John Baldwin
1d56414515 - Adjust comment noting that handling of CPU limit exhaustion is done in
ast().
- Actually set KEF_ASTPENDING so ast() is called.  I think this is buggy
  for a process with multiple KSE's in that PS_XCPU is not a KSE event,
  it's a process-wide event.  IMO there really should probably be two
  ASTPENDING flags, one for per-process, and one for per-KSE.

Submitted by:	bde
2002-10-01 14:10:08 +00:00
Poul-Henning Kamp
fa15abd8a6 Don't #error if we are lint. 2002-10-01 13:15:11 +00:00
Poul-Henning Kamp
3bb24c35f2 Split MBR and PC98 on-disk sliceformats out from disklabel.h, step 1:
Peter had repocopied sys/disklabel.h to sys/diskpc98.h and sys/diskmbr.h.

These two new copies are still intact copies of disklabel.h and
therefore protected by #ifndef _SYS_DISKLABEL_H_ so #including them
in programs which already include <sys.disklabel.h> is currently a
no-op.

This commit adds a number of such #includes.

Once I have verified that I have fixed all the places which need fixing,
I will commit the updated versions of the three #include files.

Sponsored by:   DARPA & NAI Labs.
2002-10-01 07:24:55 +00:00
Robert Watson
1aa37f5392 Improve locking of pipe mutexes in the context of MAC:
(1) Where previously the pipe mutex was selectively grabbed during
    pipe_ioctl(), now always grab it and then release if if not
    needed.  This protects the call to mac_check_pipe_ioctl() to
    make sure the label remains consistent.  (Note: it looks
    like sigio locking may be incorrect for fgetown() since we
    call it not-by-reference and sigio locking assumes call by
    reference).

(2) In pipe_stat(), lock the pipe if MAC is compiled in so that
    the call to mac_check_pipe_stat() gets a locked pipe to
    protect label consistency.  We still release the lock before
    returning actual stat() data, risking inconsistency, but
    apparently our pipe locking model accepts that risk.

(3) In various pipe MAC authorization checks, assert that the pipe
    lock is held.

(4) Grab the lock when performing a pipe relabel operation, and
    assert it a little deeper in the stack.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-01 04:30:19 +00:00
Robert Watson
6be0c25e4e Push 'security.mac.debug_label_fallback' behind options MAC_DEBUG.
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-10-01 03:24:20 +00:00
Juli Mallett
7bf2a42fd5 Until I find a way to release arbitrary locks held when sending signals (there
really should not be some), use the M_NOWAIT flag to malloc(9), and panic(9)
if malloc(9) fails.
2002-10-01 03:19:49 +00:00
Robert Watson
d0bd8ced91 Regen. 2002-10-01 02:37:35 +00:00
Robert Watson
4499985ef2 Reserve system call numbers for the following system calls:
__mac_get_pid		Retrieve MAC label of a process by pid

Similar to __mac_get_proc() except that the target process of
the operation is explicitly specified rather than assuming
curthread.

__mac_get_link		Retrieve MAC label of a path with NOFOLLOW
__mac_set_link		Set MAC label of a path with NOFOLLOW
extattr_set_link	Set EAs on a path with NOFOLLOW
extattr_get_link	Retrieve EAs on a path with NOFOLLOW
extattr_delete_link	Delete EAs on a path with NOFOLLOW

These calls are similar to __mac_get_file(), __mac_set_file(),
extattr_set_file(), extattr_get_file(), and extattr_delete_file(),
except that they do not follow symlinks.  The distinction between
these calls is similar to lchown() vs chown().

Implementations to follow.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-10-01 02:35:59 +00:00
Juli Mallett
a88b260a86 Back out code changes that snuck into the previous forced commit. 2002-10-01 00:16:17 +00:00
Juli Mallett
226e1171e1 (Forced commit, to clarify previous commit of ksiginfo/signal queue code.)
I've added a structure, kernel-private, to represent a pending or in-delivery
signal, called `ksiginfo'.  It is roughly analogous to the basic information
that is exported by the POSIX interface 'siginfo_t', but more basic.  I've
added functions to allocate these structures, and further to wrap all signal
operations using them.

Once the operations are wrapped, I've added a TailQ (see queue(3)) of these
structures to 'struct proc', and all pending signals are in that TailQ.  When
a signal is being delivered, it is dequeued from the list.  Once I finish
the spreading of ksiginfo throughout the tree, the dequeued structure will be
delivered to the process in question, whereas currently and normally, the
signal number is what is used.
2002-10-01 00:07:28 +00:00
John Baldwin
dc183990ca - Add a new per-process flag PS_XCPU to indicate that at least one thread
has exceeded its CPU time limit.
- In mi_switch(), set PS_XCPU when the CPU time limit is exceeded.
- Perform actual CPU time limit exceeded work in ast() when PS_XCPU is set.

Requested by:	many
2002-09-30 21:13:54 +00:00
John Baldwin
f4cd8f9ff4 Change p_cpulimit to be in seconds instead of microseconds. Since
p_runtime now is a bintime, it is no longer an optimization to store
p_cpulimit as microseconds.

Suggested by:	phk
2002-09-30 21:08:38 +00:00
Robert Watson
0626774f08 Move vnode MAC label initialization to after the release of the vnode
interlock in getnewvnode() to avoid possible sleeps while holding
the mutex.  Note that the warning from Witness is a slight false
positive since we know there will be no contention on the interlock
since we haven't made the vnode available for use yet, but the theory
is not a bad one.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-09-30 20:51:48 +00:00
Robert Watson
c031391bd5 Add tunables for the existing sysctl twiddles for pipe and vm
enforcement so they can be disabled prior to kernel start.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-09-30 20:50:00 +00:00
Juli Mallett
1226f694e6 First half of implementation of ksiginfo, signal queues, and such. This
gets signals operating based on a TailQ, and is good enough to run X11,
GNOME, and do job control.  There are some intricate parts which could be
more refined to match the sigset_t versions, but those require further
evaluation of directions in which our signal system can expand and contract
to fit our needs.

After this has been in the tree for a while, I will make in kernel API
changes, most notably to trapsignal(9) and sendsig(9), to use ksiginfo
more robustly, such that we can actually pass information with our
(queued) signals to the userland.  That will also result in using a
struct ksiginfo pointer, rather than a signal number, in a lot of
kern_sig.c, to refer to an individual pending signal queue member, but
right now there is no defined behaviour for such.

CODAFS is unfinished in this regard because the logic is unclear in
some places.

Sponsored by:	New Gold Technology
Reviewed by:	bde, tjr, jake [an older version, logic similar]
2002-09-30 20:20:22 +00:00
Poul-Henning Kamp
50c2233141 Plug memory leaks.
Detected by:	FlexeLint
Approved by:	jhb
2002-09-30 19:19:47 +00:00
Julian Elischer
2735483034 uh, commit all of the patch 2002-09-29 23:28:58 +00:00
Julian Elischer
e081731767 commit the version I actually tested..
Submitted by:	davidxu
2002-09-29 23:23:25 +00:00
Julian Elischer
9eb1fdea37 Implement basic KSE loaning. This stops a hread that is blocked in BOUND mode
from stopping another thread from completing a syscall, and this allows it to
release its resources etc. Probably more related commits to follow (at least
one I know of)

Initial concept by: julian, dillon
Submitted by:	davidxu
2002-09-29 23:04:34 +00:00
David E. O'Brien
21b68415cd Fix style nit where conditionally compiled code was unconditionalized,
but style(9) was consulted.

Submitted by:	bde
2002-09-29 04:47:41 +00:00
Julian Elischer
0cd3964f6d lock proc while calling psignal
(plus related cleanups)

Submitted by:	davidxu
2002-09-29 02:48:37 +00:00
Poul-Henning Kamp
651dde1b81 Move includ of <sys/bus_priate.h> later to get semantic identity of
device_t the same throughout kernel.

This is a very fine point of C which fortunatly does not make any
difference in normal circumstances but which due to the pervasiveness
of device_t in the kernel can make a lint barf a lot.
2002-09-28 21:38:35 +00:00
Poul-Henning Kamp
2f9752e95e Change a return to a break so the local buffers get properly freeed.
Spotte by:	FlexeLint

Reviewed by:	rwatson
2002-09-28 21:34:31 +00:00
Poul-Henning Kamp
53cc479393 Remove unused includes.
Clarify the intention of a while();
Move a local variable to avoid potential name-confusion.
2002-09-28 17:46:30 +00:00
Poul-Henning Kamp
37c841831f Be consistent about "static" functions: if the function is marked
static in its prototype, mark it static at the definition too.

Inspired by:    FlexeLint warning #512
2002-09-28 17:15:38 +00:00
Poul-Henning Kamp
54286a04c5 Correctly order VI_UNLOCK(), local variables and block comment. 2002-09-28 12:15:44 +00:00
Julian Elischer
165d2b993c Rewrite the kse_create() function to better aproach the semantics we
have specified in the design.
2002-09-28 08:44:31 +00:00
Jake Burkholder
169d513cb4 Add a workaround for what seems to be confusion between binutils and the
sparc v9 ABI.  The Elf_Rela records for local symbols appear to already
have the symbol's value added in to the addend field, even though the ABI
specifies we need to lookup the symbol and add its value too.  This breaks
text relocations in klds because the symbol's value is added twice, and
the resulting address points off into nowhere land, so for now just use
the addend.

Tested by:	rwatson
2002-09-27 23:12:53 +00:00
Poul-Henning Kamp
ca916247cd Rename struct specinfo to the more appropriate struct cdev.
Agreed on:	jake, rwatson, jhb
2002-09-27 18:27:10 +00:00
Julian Elischer
3d0586d4f2 Redo how completing threads pass their state to userland
if they are not going to cross over themselves. Also change how the list of
completed user threads is tracked and passed to the KSE. This is not
a change in design but rather the implementation of what was originally
envisionned.
2002-09-27 07:11:11 +00:00
Poul-Henning Kamp
3c275c19c4 Under DIAGNOSTIC, complain if ENOIOCTL leaks out through VOP_IOCTL(). 2002-09-26 21:21:13 +00:00
Poul-Henning Kamp
089cf428da Make biowait() check bio_error before the BIO_ERROR flag, to propery
catch internal GEOM use of bio_error.

Sponsored by:	DARPA & NAI Labs.
2002-09-26 16:32:14 +00:00
Jeff Roberson
a414302f90 - Export the alq daemon thread pointer.
- Don't log ktr events from the alq daemon.
2002-09-26 07:38:56 +00:00
Jeff Roberson
6423c9433c - Move ASSERT_VOP_*LOCK* functionality into functions in vfs_subr.c
- Make the VI asserts more orthogonal to the rest of the asserts by using a
   new, common vfs_badlock() function and adding a 'str' arg.
 - Adjust generated ASSERTS to match the new prototype.
 - Adjust explicit ASSERTS to match the new prototype.
2002-09-26 04:48:44 +00:00
Jeff Roberson
6fc15f9bdf - We don't need any automated lock checking for vop_islocked. 2002-09-26 00:31:16 +00:00
Archie Cobbs
89def71cbd Make the following name changes to KSE related functions, etc., to better
represent their purpose and minimize namespace conflicts:

	kse_fn_t		-> kse_func_t
	struct thread_mailbox	-> struct kse_thr_mailbox
	thread_interrupt()	-> kse_thr_interrupt()
	kse_yield()		-> kse_release()
	kse_new()		-> kse_create()

Add missing declaration of kse_thr_interrupt() to <sys/kse.h>.
Regenerate the various generated syscall files. Minor style fixes.

Reviewed by:	julian
2002-09-25 18:10:42 +00:00
Bruce Evans
ac0653dcc8 Round up instead of towards 0 in clock_getres() so that a resolution of
0 is never returned.

PR:		41781
MFC after:	3 days
2002-09-25 12:00:38 +00:00
Jeff Roberson
6cb8bf2027 - Lock down the syncer with sync_mtx.
- Enable vfs_badlock_mutex by default.
 - Assert that the vp is locked in VOP_UNLOCK.
 - Use standard interlock macros in remaining code.
 - Correct a race in getnewvnode().
 - Lock access to v_numoutput with interlock.
 - Lock access to buf lists and splay tree with interlock.
 - Add VOP and VI asserts.
 - Lock b_vnbufs with the vnode interlock.
 - Add vrefcnt() for callers who want to retreive the vnode ref without
   holding a lock.  Add a comment that describes when this is safe.
 - Add vholdl() and vdropl() so that callers who already own the interlock
   can avoid race conditions and unnecessary unlocking.
 - Move the VOP_GETATTR() in vflush() into the WRITECLOSE conditional case.
 - Hold the interlock before droping the mntlist_mtx in vflush() to avoid
   a race.
 - Fix locking in vfs_msync().
2002-09-25 02:22:21 +00:00
Jeff Roberson
d40a8125f5 - Properly lock v_vflags in getdirents(). 2002-09-25 02:13:38 +00:00
Jeff Roberson
d64370cb30 - Use incore() where no other interlock locking is necessary.
- Lock access to numoutput.
2002-09-25 02:12:32 +00:00
Jeff Roberson
b7227b7712 - Lock accesses to v_numoutput.
- Lock calls to gbincore.
2002-09-25 02:11:37 +00:00
Jeff Roberson
609058e884 - Don't protect mountedhere with the vn interlock.
- Protect mountedhere with the vn lock.
2002-09-25 01:44:21 +00:00
Jeff Roberson
3cc511c528 - Use the standard vp interlock macros. 2002-09-25 01:42:24 +00:00
Julian Elischer
ed32df81e8 Don't use local variable 'p' in a debug statement.. we removed it. 2002-09-23 14:06:12 +00:00
Julian Elischer
10b33e6b2c oops don't do dthe copy range in a new KSE. There isn't one any more. 2002-09-23 14:01:01 +00:00
Julian Elischer
253fdd5ba9 slightly clean up the thread_userret() and thread_consider_upcall() calls.
also some slight changes for TDF_BOUND testing and small style changes
Should ONLY affect KSE programs

Submitted by:	davidxu
2002-09-23 06:14:30 +00:00
Julian Elischer
acb460624e Add code to create > 1 KSe per process.
(support code not yet complete)

Submitted by:	davidxu
2002-09-23 06:10:24 +00:00
Julian Elischer
33c06e1d3e Indentation does not define a block.. you need breces {} as well..
also add a mutex assert.  (threaded path only)

Submitted by:	davidxu
2002-09-23 05:27:30 +00:00
Jeff Roberson
9e9256e252 - Hold the credential of the caller and use it in all subsequent vn ops.
- Get rid of the ill conceived aq_td field.

Suggested by:	rwatson
2002-09-23 05:20:00 +00:00
Jeff Roberson
abee588b36 - Add support for logging KTR via ALQ. This is optional and enabled by the
KTR_ALQ config option.
2002-09-22 07:13:45 +00:00
Jeff Roberson
c76e20451c - Tell witness about ALQ's spin lock. 2002-09-22 07:11:57 +00:00
Jeff Roberson
9405072a95 - Add an asynchronous fixed length record logging mechanism called
ALQ (Asynch. Logging Queues).  ALQ supports many seperate queues with
   different record and buffer sizes.  It opens and logs to any vnode so
   it can be used with character devices as well as regular files.

Reviewed in part by:	phk, jake, markm
2002-09-22 07:11:14 +00:00
Jake Burkholder
98f93c07a5 Removed unneeded include (missed in last revision). 2002-09-22 06:05:23 +00:00
Jake Burkholder
e3b6e33c07 Moved netisr code from kern/kern_intr.c to net/netisr.c as threatened in a
comment.
2002-09-22 05:56:41 +00:00
Jake Burkholder
05ba50f522 Use the fields in the sysentvec and in the vm map header in place of the
constants VM_MIN_ADDRESS, VM_MAXUSER_ADDRESS, USRSTACK and PS_STRINGS.
This is mainly so that they can be variable even for the native abi, based
on different machine types.  Get stack protections from the sysentvec too.
This makes it trivial to map the stack non-executable for certain abis, on
machines that support it.
2002-09-21 22:07:17 +00:00
Poul-Henning Kamp
66cdbc28d0 Assert my copyright on this file (using the default 2-clause BSD).
The vast majority of the contents is from my keyboard and no
significant pieces remain of the former copyright holders code.
2002-09-20 22:26:27 +00:00
Poul-Henning Kamp
7812d86f03 (This commit touches about 15 disk device drivers in a very consistent
and predictable way, and I apologize if I have gotten it wrong anywhere,
getting prior review on a patch like this is not feasible, considering
the number of people involved and hardware availability etc.)

If struct disklabel is the messenger: kill the messenger.

Inside struct disk we had a struct disklabel which disk drivers used to
communicate certain metrics to the disklayer above (GEOM or the disk
mini-layer).  This commit changes this communication to use four
explicit fields instead.

Amongst the benefits is that the fields do not get overwritten by
wrong or bogus on-disk disklabels.

Once that is clear, <sys/disk.h> which is included in the drivers
no longer need to pull <sys/disklabel.h> and <sys/diskslice.h> in,
the few places that needs them, have gotten explicit #includes for
them.

The disklabel inside struct disk is now only for internal use in
the disk mini-layer, so instead of embedding it, we malloc it as
we need it.

This concludes (modulus any mistakes) the series of disklabel related
commits.

I belive it all amounts to a NOP for all the rest of you :-)

Sponsored by:   DARPA & NAI Labs.
2002-09-20 19:36:05 +00:00
Poul-Henning Kamp
6fb3d70418 For reasons now lost in historical fog, the bounds_check_with_label()
function were put in i386/i386/machdep.c from where it has been
cut and pasted to other architectures with only minor corruption.

Disklabel is really a MI format in many ways, at least it certainly
is when you operate on struct disklabel.

Put bounds_check_with_label() back in subr_disklabel.c where it belongs.

Sponsored by:   DARPA & NAI Labs.
2002-09-20 17:51:00 +00:00
Poul-Henning Kamp
2e45c1b191 We don't need the <sys/disklabel.h> include for alpha anymore.
Sponsored by:	DARPA & NAI Labs.
2002-09-20 17:45:44 +00:00
Poul-Henning Kamp
2382fb0a84 Make FreeBSD "struct disklabel" agnostic, step 312 of 723:
Rename bioqdisksort() to bioq_disksort().
Keep a #define around to avoid changing all diskdrivers right now.

Move it from subr_disklabel.c to subr_disk.c.
Move prototype from <sys/disklabel.h> to <sys/bio.h>

Sponsored by:   DARPA and NAI Labs.
2002-09-20 14:14:37 +00:00
Poul-Henning Kamp
f90c382c0c Make FreeBSD "struct disklabel" agnostic, step 311 of 723:
Rename diskerr() to disk_err() for naming consistency.

Drop the by now entirely useless struct disklabel argument.

Add a flag argument for new-line termination.

Fix a couple of printf-format-casts to %j instead of %l.

Correctly print the name of all bio commands.

Move the function from subr_disklabel.c to subr_disk.c,
and from <sys/disklabel.h> to <sys/disk.h>.

Use the new disk_err() throughout, #include <sys/disk.h> as needed.

Bump __FreeBSD_version for the sake of the aac disk drivers #ifdefs.

Remove unused disklabel members of softc for aac, amr and mlx, which seem
to originally have been intended for diskerr() use, but which only rotted
and got Copy&Pasted at least two times to many.

Sponsored by:   DARPA & NAI Labs.
2002-09-20 12:52:03 +00:00
Poul-Henning Kamp
837c5e5c2b Remove unused variable. 2002-09-20 09:33:30 +00:00
Poul-Henning Kamp
46714777f5 Retire now unused DIOCGDVIRGIN kludge.
Sponsored by:	DARPA & NAI Labs.
2002-09-20 09:31:14 +00:00
Maxime Henrion
e2587e98e5 Switch to using strlcpy() in several places. It seems there
were cases where we could get unterminated strings before.
2002-09-19 18:54:22 +00:00
John Baldwin
e485b64b08 Add ability to dump stacktraces on kernel panics when DDB is compiled into
the kernel.  By default this is turned off since otherwise it could scroll
valuable panic messages off of the screen.  This option can be turned on
by the DDB_TRACE kernel option as well as the debug.trace_on_panic sysctl.

Also, fix the DDB_UNATTENDED option to use its own header instead of
abusing opt_ddb.h.  This way turning that one option on or off doesn't
force you to recompile all of ddb.

Requested by:	many (1), bde (2*)

* - I know bde prefers !abusing option headers in general but can't
    remember if he as brought up this specific case.
2002-09-19 18:49:46 +00:00
Don Lewis
fa288043e2 VOP_FSYNC() requires that it's vnode argument be locked, which nfs_link()
wasn't doing.  Rather than just lock and unlock the vnode around the call
to VOP_FSYNC(), implement rwatson's suggestion to lock the file vnode
in kern_link() before calling VOP_LINK(), since the other filesystems
also locked the file vnode right away in their link methods.  Remove the
locking and and unlocking from the leaf filesystem link methods.

Reviewed by:	rwatson, bde  (except for the unionfs_link() changes)
2002-09-19 13:32:45 +00:00
Julian Elischer
4a3276d4a4 While well intentionned the check to see it there is a packet
header and return that length, was misguided.

The check itself didn't take into account the fact that the
mbuf pointer pased in may be null, and the function is
defined specifically for cases where the caller knows what it wants.
Rather than fix the check I'm removing it as phk suggested.

Submitted by:	 phk@freebsd.org
2002-09-19 08:28:41 +00:00
Julian Elischer
4a49235b89 fix style.. Return in the kernel always has () around the arguments. 2002-09-19 03:18:44 +00:00
Julian Elischer
1494277d50 Compiler was correct:
m WAS being used uninitialized..
2002-09-19 03:15:39 +00:00
Darren Reed
e62497713c If M_PKTHDR is set then we don't need to do a loop to find the total length. 2002-09-19 01:21:24 +00:00
Alfred Perlstein
3ffb9fadc8 Regen for added syscalls. 2002-09-19 00:48:57 +00:00
Alfred Perlstein
6d5dec35b7 Add the rest of the kernel support for the sem_ API in kern/uipc_sem.c.
Option 'P1003_1B_SEMAPHORES' to compile them in, or load the "sem" module
to activate them.

Have kern/makesyscalls.sh emit an include for sys/_semaphore.h into sysproto.h
to pull in the typedef for semid_t.

Add the syscalls to the syscall table as module stubs.
2002-09-19 00:43:32 +00:00
Alfred Perlstein
efaa658806 Bring in my implementation of kernel support for posix realtime semaphores
that are shareable between processes.

There will be a cleanup shortly along with the necessary changes made to
libc, libc_r, libpthread as well as the hooks into sys/conf and sys/modules.
2002-09-18 22:47:42 +00:00
Robert Watson
cc51a2b55e Remove un-needed stack variable 'ops'.
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, Network Associates Laboratories
2002-09-18 22:35:02 +00:00
Bosko Milekic
bd395ae8f6 style nit: unsigned -> u_int in the kernel, particularly to
stay consistent in this file, and keep m_length() and m_fixhdr()
consistent with their prototypes in mbuf.h

Inspired by: bde
2002-09-18 22:33:52 +00:00
Nate Lawson
86ed6d45ac Remove any VOP_PRINT that redundantly prints the tag.
Move lockmgr_printinfo() into vprint() for everyone's benefit.

Suggested by: bde
2002-09-18 20:42:04 +00:00
Poul-Henning Kamp
7ed60de837 Use m_length() instead of home-rolled versions. 2002-09-18 19:44:14 +00:00
Poul-Henning Kamp
4e4425d486 Make m_length() and m_fixhdr() return unsigned.
Suggested by:	arr
2002-09-18 19:42:06 +00:00
Poul-Henning Kamp
ac6e585d24 Introduce the m_length() function which will return the accumulated
length of an mbuf-chain and optionally a pointer to the last mbuf.
2002-09-18 14:57:35 +00:00
Poul-Henning Kamp
3f2e06c5e1 Move m_fixhdr() from "mbchain" to "mbuf" where it belongs. 2002-09-18 13:41:37 +00:00
Jeff Roberson
99571dc345 - Split UMA_ZFLAG_OFFPAGE into UMA_ZFLAG_OFFPAGE and UMA_ZFLAG_HASH.
- Remove all instances of the mallochash.
 - Stash the slab pointer in the vm page's object pointer when allocating from
   the kmem_obj.
 - Use the overloaded object pointer to find slabs for malloced memory.
2002-09-18 08:26:30 +00:00
Robert Watson
ca7850c313 Add a toggle to disable VM enforcement.
Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-09-18 02:02:08 +00:00
Robert Watson
b88c98f6b1 At the cost of seeming a little gauche, make use of more traditional
alphabetization for mac_enforce_pipe sysctl.

Obtained from:	TrustedBSD Project
Sponsored by:	DAPRA, NAI Labs
2002-09-18 02:00:19 +00:00
Robert Watson
289c6dea76 Don't call VOP_LEASE() while holding the accounting mutex. 2002-09-18 01:56:13 +00:00
Peter Wemm
acaa156683 Argh. I've been reading makefiles for too long. Change comment to a
C-style comment.
2002-09-17 07:41:30 +00:00
Peter Wemm
1e19df3303 Stub out the calls to get_mcontext and set_mcontext which only exist on
i386.  This stuff should not be prototyped in MD inludes if the interface
is expected to be MI.
2002-09-17 07:40:15 +00:00
Peter Wemm
66422f5b7a Initiate deorbit burn for the i386-only a.out related support. Moves are
under way to move the remnants of the a.out toolchain to ports.  As the
comment in src/Makefile said, this stuff is deprecated and one should not
expect this to remain beyond 4.0-REL.  It has already lasted WAY beyond
that.

Notable exceptions:
gcc - I have not touched the a.out generation stuff there.
ldd/ldconfig - still have some code to interface with a.out rtld.
old as/ld/etc - I have not removed these yet, pending their move to ports.
some includes - necessary for ldd/ldconfig for now.

Tested on: i386 (extensively), alpha
2002-09-17 01:49:00 +00:00
Jonathan Mini
c76e33b681 Add kernel support needed for the KSE-aware libpthread:
- Use ucontext_t's to store KSE thread state.
	- Synthesize state for the UTS upon each upcall, rather than
	  saving and copying a trapframe.
	- Deliver signals to KSE-aware processes via upcall.
	- Rename kse mailbox structure fields to be more BSD-like.
	- Store the UTS's stack in struct proc in a stack_t.

Reviewed by:	bde, deischen, julian
Approved by:	-arch
2002-09-16 19:26:48 +00:00
Poul-Henning Kamp
7b08810243 Add a cast to make this file compile in userland on sparc64 without
warnings.
2002-09-16 18:45:18 +00:00
Thomas Moestl
dde1c2c0d6 fcntl(..., F_SETLKW, ...) takes a pointer to a struct flock just like
F_SETLK does, so it also needs this structure copied in in fnctl() before
calling kern_fcntl().
2002-09-16 01:05:15 +00:00