Commit Graph

9582 Commits

Author SHA1 Message Date
rwatson
7beaaf5cd2 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
alc
cbcb760109 Replace PG_BUSY with VPO_BUSY. In other words, changes to the page's
busy flag, i.e., VPO_BUSY, are now synchronized by the per-vm object
lock instead of the global page queues lock.
2006-10-22 04:28:14 +00:00
davidxu
5bbfc34004 Use macro TAILQ_FOREACH_SAFE instead of expanding it. 2006-10-22 00:09:41 +00:00
davidxu
df9c81e665 Since revision 1.333 of kern_sig.c no longer uses P_WEXIT, the change
opened a race window which can cause memory leak in signal queue.
Here we free memory for signal queue when process state is set to
PRS_ZOMBIE.
2006-10-21 23:59:15 +00:00
jhb
4cc0b2f46e Remove the check that prevented signals from being delivered to exiting
processes.  It was originally added back when support for Linux threads
(and thus shared sigacts objects) was added, but no one knows why.  My
guess is that at some point during the Linux threads patches, the sigacts
object was torn down during exit1(), so this check was added to prevent
a panic for that race.  However, the stuff that was actually committed to
the tree doesn't teardown sigacts until wait() making the above race moot.
Re-allowing signals here lets one interrupt a NFS request during process
teardown (such as closing descriptors) on an interruptible mount.

Requested by:	kib (long time ago)
MFC after:	1 week
2006-10-20 16:19:21 +00:00
kib
5f5bf9dadc Fix the race between devfs_fp_check and devfs_reclaim. Derefence the
vnode' v_rdev and increment the dev threadcount , as well as clear it
(in devfs_reclaim) under the dev_lock().

Reviewed by:	tegge
Approved by:	pjd (mentor)
2006-10-20 07:59:50 +00:00
bde
57b90fd011 kern_intr.c:
- Count (scheduling of) software interrupts (SWIs) as SWIs, not as
  hardware interrupts.
- Don't count (scheduling of) delayed SWIs as interrupts at all, since
  in the delayed case it is expected that there are many more scheduling
  calls than handling calls.  Perhaps all interrupts should be counted
  only when they are handled, but it is only counts of delayed SWIs that
  shouldn never be combined with the other counts.

subr_trap.c:
- Count (handling of) Asynchronous System Traps (ASTs) as traps, not as
  software interrupts.

Before these changes, the counter for SWIs only counted ASTs, and SWIs
weren't counted separately, but a subcounter for ASTs alone is less
needed than for most other exception sources.

4.4BSD-Lite uses the counters for similar things (actually matching
their names) on its main arches (hp300, ..., !i386) where more of the
exceptions are in hardware.
2006-10-18 04:48:09 +00:00
davidxu
9e7c324e6e Regenerate. 2006-10-17 02:28:58 +00:00
davidxu
bb5a3880aa o Add keyword volatile for user mutex owner field.
o Fix type consistent problem by using type long for old
  umtx and wait channel.
o Rename casuptr to casuword.
2006-10-17 02:24:47 +00:00
netchild
183bd5a34b MFP4 (with some minor changes):
Implement the linux_io_* syscalls (AIO). They are only enabled if the native
AIO code is available (either compiled in to the kernel or as a module) at
the time the functions are used. If the AIO stuff is not available there
will be a ENOSYS.

From the submitter:
---snip---
DESIGN NOTES:

1. Linux permits a process to own multiple AIO queues (distinguished by
   "context"), but FreeBSD creates only one single AIO queue per process.
   My code maintains a request queue (STAILQ of queue(3)) per "context",
   and throws all AIO requests of all contexts owned by a process into
   the single FreeBSD per-process AIO queue.

   When the process calls io_destroy(2), io_getevents(2), io_submit(2) and
   io_cancel(2), my code can pick out requests owned by the specified context
   from the single FreeBSD per-process AIO queue according to the per-context
   request queues maintained by my code.

2. The request queue maintained by my code stores contrast information between
   Linux IO control blocks (struct linux_iocb) and FreeBSD IO control blocks
   (struct aiocb). FreeBSD IO control block actually exists in userland memory
   space, required by FreeBSD native aio_XXXXXX(2).

3. It is quite troubling that the function io_getevents() of libaio-0.3.105
   needs to use Linux-specific "struct aio_ring", which is a partial mirror
   of context in user space. I would rather take the address of context in
   kernel as the context ID, but the io_getevents() of libaio forces me to
   take the address of the "ring" in user space as the context ID.

   To my surprise, one comment line in the file "io_getevents.c" of
   libaio-0.3.105 reads:

             Ben will hate me for this

REFERENCE:

1. Linux kernel source code:   http://www.kernel.org/pub/linux/kernel/v2.6/
   (include/linux/aio_abi.h, fs/aio.c)

2. Linux manual pages:         http://www.kernel.org/pub/linux/docs/manpages/
   (io_setup(2), io_destroy(2), io_getevents(2), io_submit(2), io_cancel(2))

3. Linux Scalability Effort:   http://lse.sourceforge.net/io/aio.html
   The design notes:           http://lse.sourceforge.net/io/aionotes.txt

4. The package libaio, both source and binary:
       http://rpmfind.net/linux/rpm2html/search.php?query=libaio
   Simple transparent interface to Linux AIO system calls.

5. Libaio-oracle:              http://oss.oracle.com/projects/libaio-oracle/
   POSIX AIO implementation based on Linux AIO system calls (depending on
   libaio).
---snip---

Submitted by:	Li, Xiao <intron@intron.ac>
2006-10-15 14:22:14 +00:00
ru
1bc0f5fc76 Prevent IOC_IN with zero size argument (this is only supported
if backward copatibility options are present) from attempting
to free memory that wasn't allocated.  This is an old bug, and
previously it would attempt to free a null pointer.  I noticed
this bug when working on the previous revision, but forgot to
fix it.

Security:	local DoS
Reported by:	Peter Holm
MFC after:	3 days
2006-10-14 19:01:55 +00:00
trhodes
e7d9ab0897 Close a race condition where num can be larger than tmp, giving the user
too large of a boundary.

Reported by:	Ilja Van Sprundel
2006-10-14 10:30:14 +00:00
tegge
2d3b850cd6 Wait for thread count to reach zero in destroy_devl() even when no purge
method is defined, to avoid memory being modified after free.

Temporarily increase refcount in destroy_devl() to avoid a double free
if dev_rel() is called while waiting for thread count to reach zero.
2006-10-13 20:49:24 +00:00
glebius
7fc0346961 Improve ktr(4) logging for callout(9) subsystem. Log all inserts and
removals, including failures, into the callwheel.

XXX: Most of the CTR() macros are called with callout_lock spin mutex
held, thus won't be logged into file, if KTR_ALQ is used. Moving the
CTR() macros out from the spinlocked code would require copying of all
arguments. I'm too lazy to do this.
2006-10-11 14:57:03 +00:00
davidxu
c5bda619e9 Implement 32bit umtx_lock and umtx_unlock system calls, these two system
calls are not used by libthr in RELENG_6 and HEAD, it is only used by
the libthr in RELENG-5, the _umtx_op system call can do more incremental
dirty works than these two system calls without having to introduce new
system calls or throw away old system calls when things are going on.
2006-10-06 08:22:08 +00:00
davidxu
0fa66af83d Move some declaration of 32-bit signal structures into file
freebsd32-signal.h, implement sigtimedwait and sigwaitinfo system calls.
2006-10-05 01:56:11 +00:00
mbr
7d21b9894a Back out part of rev. 1.149. While adding a workaround in ptcopen() to
avoid leaked ptys works fine, this opens a possible security hole.

Submitted by:	bde
MFC after:	3 days
2006-10-04 05:43:39 +00:00
rwatson
dafc0585f0 Regenerate. 2006-10-03 20:48:11 +00:00
rwatson
b54cd1fd3c Audit creat() system call (compat code), and change type for getpagesize(),
which isn't actually being audited anyway.

MFC after:	3 days
Obtained from:	TrustedBSD Project
2006-10-03 20:46:52 +00:00
kib
38b2c3b26f Fix the remaining race in the revs. 1.232, 1,233 that could occur during
unmount when mp structure is reused while waiting for coveredvp lock.
Introduce struct mount generation count, increment it on each reuse and
compare the generations before and after obtaining the coveredvp lock.

Reviewed by:	tegge, pjd
Approved by:	pjd (mentor)
MFC after:	2 weeks
2006-10-03 10:47:04 +00:00
phk
638e020bc6 Use utc_offset() where applicable, and hide the internals of it
as static variables.
2006-10-02 18:23:37 +00:00
phk
52fd275504 Introduce utc_offset() to capture a calculation currently done all over the
place.
2006-10-02 16:17:23 +00:00
phk
7a874f961d Move tz_minuteswest and tz_dsttime to subr_clock.c 2006-10-02 16:06:26 +00:00
phk
84cc0f277f Second part of a little cleanup in the calendar/timezone/RTC handling.
Split subr_clock.c in two parts (by repo-copy):
   subr_clock.c contains generic RTC and calendaric stuff. etc.
   subr_rtc.c contains the newbus'ified RTC interface.

Centralize the machdep.{adjkerntz,disable_rtc_set,wall_cmos_clock}
sysctls and associated variables into subr_clock.c.  They are
not machine dependent and we have generic code that relies on being
present so they are not even optional.
2006-10-02 15:42:02 +00:00
phk
50c81b8a9a First part of a little cleanup in the calendar/timezone/RTC handling.
Move relevant variables to <sys/clock.h> and fix #includes as necessary.

Use libkern's much more time- & spamce-efficient BCD routines.
2006-10-02 12:59:59 +00:00
kib
aa82c72808 Correct the comment: numvnodes is decreased on vdestroying the vnode.
OKed by:	tegge
Approved by:	pjd (mentor)
MFC after:	1 week
2006-10-02 07:25:58 +00:00
tegge
4d26d3d4d7 If the buffer lock has waiters after the buffer has changed identity then
getnewbuf() needs to drop the buffer in order to wake waiters that might
sleep on the buffer in the context of the old identity.
2006-10-02 02:06:27 +00:00
mbr
7e18a19928 Readd rev. 1.145 because of vfs bugs and races near revoke(). Until they
are fixed we can't free any slaves. Add a workaround to not to leak ptys
by number.
2006-09-30 22:51:05 +00:00
pjd
1358d3f1ae Remove duplicated $FreeBSD$. 2006-09-30 16:33:29 +00:00
mbr
b63df50b58 Any call of tty_close() with a tty refcount of <= 1 is wrong and we will
free the tty in this case. This is a workaround until the underlaying
devfs/tty problems are fixed.

MFC after:	1 day
2006-09-30 08:11:51 +00:00
mbr
06d2753424 Free tty struct after last close. This should fix the pty-leak by numbers.
Remove workarounds for tty_refcount beeing 0, this will be fixed differently
later.
2006-09-29 09:53:19 +00:00
mbr
ed48249ff4 Free tty struct after last close. This should fix the pty-leak by numbers.
Remove workarounds for tty_refcount beeing 0, this will be fixed differently
later.

Back out rev 1.145 since we initialize the tty struct from scratch and bad
things can't happen anymore.
2006-09-29 09:52:57 +00:00
ru
4ef62e4ca5 Fix our ioctl(2) implementation when the argument is "int". New
ioctls passing integer arguments should use the _IOWINT() macro.
This fixes a lot of ioctl's not working on sparc64, most notable
being keyboard/syscons ioctls.

Full ABI compatibility is provided, with the bonus of fixing the
handling of old ioctls on sparc64.

Reviewed by:	bde (with contributions)
Tested by:	emax, marius
MFC after:	1 week
2006-09-27 19:57:02 +00:00
mbr
24e5e6df3a Move Giant up even further since P_CONTROLT isn't really fully locked
yet (p_flag is, but P_CONTROLT isn't really).

Submitted by:	jhb
2006-09-27 16:42:10 +00:00
mbr
b07486e6c4 Use ctty instead of just returning. ctty just has a simple open that
returns ENXIO.

Submitted by:	jhb
2006-09-27 16:41:15 +00:00
tegge
89ea8a9b1b Reduce fluctuations of mnt_flag to allow unlocked readers to get a
slightly more consistent view.
2006-09-26 04:20:09 +00:00
tegge
a802c35eb4 Don't restore MNT_QUOTA bit in mnt_flag after a failed mount with
MNT_UPDATE flag, closing a race between nmount() and quotactl().
2006-09-26 04:18:36 +00:00
tegge
f42473d76b Add mnt_noasync counter to better handle interleaved calls to nmount(),
sync() and sync_fsync() without losing MNT_ASYNC.  Add MNTK_ASYNC flag
which is set only when MNT_ASYNC is set and mnt_noasync is zero, and
check that flag instead of MNT_ASYNC before initiating async io.
2006-09-26 04:15:59 +00:00
tegge
b500b0ae92 Don't restore mnt_kern_flag on failed MNT_UPDATE mount, it can race
with dounmount(), causing loss of MNTK_UNMOUNT flag.
2006-09-26 04:15:04 +00:00
tegge
83154f853d Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag.
This eliminates a race where MNT_UPDATE flag could be lost when nmount()
raced against sync(), sync_fsync() or quotactl().
2006-09-26 04:12:49 +00:00
rwatson
aa39d5a21d SI_ORDER_THIRD + 2, not SI_ORDER_FOURTH + 2.
MFC after:	3 days
Submitted by:	mlaier
2006-09-26 00:15:56 +00:00
rwatson
ffe75b8d42 Add "FreeBSD" trademark statement to copyright section of boot messages.
MFC after:	3 days
Approved by:	core, board at FreeBSDFoundation dot org
2006-09-25 23:19:01 +00:00
jmg
2b64b4e601 remove unnecessary NULL check...
Coverity ID:	1545
2006-09-25 01:29:48 +00:00
jmg
28f00353ba hide kqueue_register from public view, and replace it w/ kqfd_register...
this eliminates a possible race in aio registering a kevent..
2006-09-24 04:47:47 +00:00
jmg
08d150cea6 return EBADF instead of successfully attaching (and then panicing) when
an fd is dieing..

Convinced by:	jhb
PR:		103127
2006-09-24 02:29:53 +00:00
jmg
689b605f8b add KTRACE hooks into kevent... This will help people debug their kqueue
programs to find out exactly which events were registered and which were
returned...  This should be lower in kern_kevent, but that would require
special munging due to locks and the functions used to copyin/copyout
kevents...

If someone wants to teach ktrace how to output pretty kevents, I have a
kevent prety printer that can be used...
2006-09-24 02:23:29 +00:00
mbr
f58415b4f5 Protect enterpgrp() against another tty/proc race case until the tty locking work
has been fixed.

MFC after:	1 week
2006-09-23 17:35:24 +00:00
mbr
9af53fcaab Check for tp->t_refcnt == 0 before doing anything in tty_open().
PR:		103520
MFC after:	1 week
2006-09-23 14:52:46 +00:00
mbr
35537661a5 If /dev/tty gets opened after your controlling terminal has been revoked
you can't call tty_clone afterwords. OpenBSD and NetBSD both fail the
open call in that case, so we should do so as well. This can
be done in ctty_clone by returning with *dev==NULL. Admittedly this
causes open to return ENOENT, instead of ENXIO as on the other BSDs,
but this way requires the least touching of code.

Submitted by:  Nate Eldredge <nge@cs.hmc.edu>
PR:            83375

MFC:           1 week
2006-09-23 14:44:14 +00:00
bms
fdb36f8d07 Fix a case where socket I/O atomicity is violated due to not dropping
the entire record when a non-data mbuf is removed in the soreceive() path.
This only triggers a panic directly when compiled with INVARIANTS.

PR:		38495
Submitted by:	James Juran
MFC after:	1 week
2006-09-22 15:34:16 +00:00