Commit Graph

9485 Commits

Author SHA1 Message Date
Doug Ambrisko
51e37c7f37 Make lio ident more consistant with aio ident. 2006-06-02 17:45:48 +00:00
Pawel Jakub Dawidek
f420242b2b Don't forget to unlock kq lock in low memory situations.
OK'ed by:	jmg
2006-06-02 13:23:39 +00:00
Pawel Jakub Dawidek
8ebab14c70 Remove confusing done_noglobal label. The KQ_GLOBAL_UNLOCK() macro know
how to handle both situations - when kq_global lock is and is not held.

OK'ed by:	jmg
2006-06-02 13:21:21 +00:00
Pawel Jakub Dawidek
241321abc0 Use SLIST_FOREACH_SAFE() macro, because knote_drop() can free an element
which can be then used to find next element in the list.

OK'ed by:	jmg
2006-06-02 13:18:59 +00:00
Olivier Houchard
4bb0f51d1d sched_rem() already sets ke->ke_state to KES_THREAD, so there's no need
to redo it.
2006-06-01 22:45:56 +00:00
Diomidis Spinellis
23efd78d03 Remove two locking assertion entries that:
a) were incorrectly written and therefore never compiled into
assertions, and
b) were incorrectly specified and when compiled resulted in a
failed assertion.
2006-05-31 14:06:06 +00:00
Diomidis Spinellis
f69ec7af12 Assertion code specifications are introduced using special character
sequences that are distinct from comments. %% is used for argument
locks; %! for pre- and post-conditions.
2006-05-30 20:49:54 +00:00
Diomidis Spinellis
b1b4282160 Remove incorrect lock validation specifications that caused
failed assertions with DEBUG_VFS_LOCKS.
We should reinstate them with correct specifications, possibly
after extendng vnode_if.awk

Noted by: truckman@
2006-05-30 20:21:51 +00:00
Tor Egge
57051fdc4b Close race between vmspace_exitfree() and exit1() and races between
vmspace_exitfree() and vmspace_free() which could result in the same
vmspace being freed twice.

Factor out part of exit1() into new function vmspace_exit().  Attach
to vmspace0 to allow old vmspace to be freed earlier.

Add new function, vmspace_acquire_ref(), for obtaining a vmspace
reference for a vmspace belonging to another process.  Avoid changing
vmspace refcount from 0 to 1 since that could also lead to the same
vmspace being freed twice.

Change vmtotal() and swapout_procs() to use vmspace_acquire_ref().

Reviewed by:	alc
2006-05-29 21:28:56 +00:00
Xin LI
56e26c3e7e Unexpand TAILQ_FIRST(foo) == NULL to TAILQ_EMPTY(foo). 2006-05-29 05:43:26 +00:00
Kris Kennaway
80a8e5da94 Correct typos
MFC after:	2 weeks
2006-05-28 22:15:28 +00:00
Robert Watson
4bb260ad78 In execve(), audit the path name being executed. In the future, it
would also be good to audit the interpreter pathname, if any.

Obtained from:	TrustedBSD Project
2006-05-28 08:28:47 +00:00
Diomidis Spinellis
0e1c7fb8ea Add missing % signs in the lock annotations of the functions:
lookup, rename, strategy, islocked
The missing % sign meant that the lines were processed as plain
comments and the corresponding assertions were never generated.
2006-05-28 07:24:12 +00:00
Xin LI
e38c7f3ef3 extlen and cpp is not used here in linker_search_kld(), so nuke them.
Reported by:	Mingyan Guo <guomingyan at gmail dot com>
MFC After:	2 weeks
2006-05-27 09:21:41 +00:00
Poul-Henning Kamp
9dd2370db6 If the console has no cncheckc method, use cngetc instead. 2006-05-26 11:00:20 +00:00
Poul-Henning Kamp
8aed7613bd Don't use CONS_DRIVER() macro to insert dummy element in cons_set 2006-05-26 10:46:38 +00:00
Poul-Henning Kamp
16b1613a31 GC the cn_dbctl_t hook for consoles, it is unused.
This used to make syscons switch to vty0 when we entered DDB but this
was lost in the KDB shuffle.  We may want to bring it back down the road
but it should be done by calling cn_init_t/cn_term_t instead, possibly
with a flag argument saying "Debugger!"
2006-05-26 10:24:00 +00:00
Craig Rodrigues
0c89bb0a02 Add "update" mount option to global_opts array,
for use with vfs_filteropt().
2006-05-26 02:38:48 +00:00
Craig Rodrigues
5eb304a91a Remove calls to vfs_export() for exporting a filesystem for NFS mounting
from individual filesystems.  Call it instead in vfs_mount.c,
after we call VFS_MOUNT() for a specific filesystem.
2006-05-26 00:32:21 +00:00
Robert Watson
20bdac8a4f Use getsock() and fput() instead of fgetsock() and fputsock() in
sendfile().  This causes sendfile() to use the file descriptor
reference to the socket instead of bumping the socket reference
count, which avoids an additional refcount operation, as well as a
potential expensive socket refcount drop, which can lead to
contention on the accept mutex.  This change also has the side
effect of further reducing the number of cases where an in-progress
I/O operation can occur on a socket after close, as using the file
descriptor refcount prevents the socket from closing while in use.

MFC after:	3 months
2006-05-25 15:10:13 +00:00
Stephan Uphoff
dcf67e65d2 Do not set B_NOCACHE on buffers when releasing them in flushbuflist().
If B_NOCACHE is set the pages of vm backed buffers will be invalidated.
However clean buffers can be backed by dirty VM pages so invalidating them
can lead to data loss.
Add support for flush dirty page in the data invalidation function
of some network file systems.

This fixes data losses during vnode recycling (and other code paths
using invalbuf(*,V_SAVE,*,*)) for data written using an mmaped file.

Collaborative effort by: jhb@,mohans@,peter@,ps@,ups@
Reviewed by:	tegge@
MFC after:	7 days
2006-05-25 01:00:35 +00:00
Sam Leffler
75b773ae3d When starting up threads in taskqueue_start_threads create them
stopped before adjusting their priority and setting them on the run
q so they cannot race for resources (pointed out by njl).

While here add a console printf on thread create fails; otherwise
noone may notice (e.g. return value is always 0 and caller has no
way to verify).

Reviewed by:	jhb, scottl
MFC after:	2 weeks
2006-05-24 22:11:07 +00:00
David Xu
f705bbe8b1 Don't allow non-root user to set a scheduler policy, otherwise this could
be a local DOS.

Submitted by: Diane Bruce at db at db.net
2006-05-21 00:40:38 +00:00
David Xu
f6c040a2c5 Style fixes.
Submitted by: Diane Bruce < db at db dot net >
2006-05-19 06:37:24 +00:00
David Xu
7b8d821268 Move flag TDF_UMTXQ into structure umtxq, this eliminates the requirement
of scheduler lock in some umtx code.
2006-05-18 08:43:46 +00:00
Poul-Henning Kamp
d595182f0b Make the printfs relating to purging threads from a device less intrusive. 2006-05-17 06:37:14 +00:00
Poul-Henning Kamp
c40da00ca3 Since DELAY() was moved, most <machine/clock.h> #includes have been
unnecessary.
2006-05-16 14:37:58 +00:00
Paul Saab
6befa6ae1b Allow concurrent read(2)/readv(2) access to a file.
Lock file offset against multiple read calls.

Submitted by:	ups
Obtained from:	Yahoo!
MFC after:	2 weeks
2006-05-16 07:50:54 +00:00
Kelly Yancey
c9ad8a67af Restore the ability to mount procfs and fdescfs filesystems via the
mount(2) system call:

  * Add cmount hook to fdescfs and pseudofs (and, by extension, procfs and
    linprocfs).  This (mostly) restores the ability to mount these
    filesystems using the old mount(2) system call (see below for the
    rest of the fix).

  * Remove not-NULL check for the data argument from the mount(2) entry
    point.  Per the mount(2) man page, it is up to the individual
    filesystem being mounted to verify data.  Or, in the case of procfs,
    etc. the filesystem is free to ignore the data parameter if it does
    not use it.  Enforcing data to be not-NULL in the mount(2) system call
    entry point prevented passing NULL to filesystems which ignored the
    data pointer value.  Apparently, passing NULL was common practice
    in such cases, as even our own mount_std(8) used to do it in the
    pre-nmount(2) world.

All userland programs in the tree were converted to nmount(2) long ago,
but I've found at least one external program which broke due to this
(presumably unintentional) mount(2) API change.  One could argue that
external programs should also be converted to nmount(2), but then there
isn't much point in keeping the mount(2) interface for backward
compatibility if it isn't backward compatible.
2006-05-15 19:42:10 +00:00
Benno Rice
77fe443878 The VERBOSE_SYSINIT stuff sees the DDB define a lot better if we include
opt_ddb.h.

Spotted by:	benno
Pointy hat to:	benno
2006-05-14 07:11:28 +00:00
Craig Rodrigues
5250012a1d For nmount(), if "rw" is specified as a mount option,
add "noro" to the list of mount options.  This allows
a read-only mount to be converted to read-write via:
mount -u -o rw

Requested by:	kris
2006-05-14 01:51:38 +00:00
John Baldwin
73dbd3da73 Remove various bits of conditional Alpha code and fixup a few comments. 2006-05-12 05:04:46 +00:00
Benno Rice
26ab616fdc Add a new kernel config option, VERBOSE_SYSINIT.
When porting FreeBSD to a new platform, one of the more useful things to do is
get mi_startup() to let you know which SYSINIT it's up to.  Most people tend to
whack a printf in the SYSINIT loop to print the address of the function it's
about to call.  Going one better, jhb made a version that uses DDB to look up
the name of the function and print that instead.  This version is essentially
his with the addition of some ifdeffery to make it optional and to allow it to
work (although using only the function address, not the symbol) if you forgot
to enable DDB.

All the cool bits by:	jhb
Approved by:		scottl, rink, cognet, imp
2006-05-12 02:01:38 +00:00
Poul-Henning Kamp
99ab8292c7 Remove more straggling CPU_ macro references 2006-05-11 17:53:26 +00:00
David Xu
005efcdb0e Use wakeup_one to avoid thundering herd.
Tested by: kris
2006-05-09 13:00:46 +00:00
David Xu
759ccccadb Use a dedicated mutex to protect aio queues, the movation is to reduce
lock contention with other parts.
2006-05-09 00:10:11 +00:00
Tor Egge
11991ab418 Call vn_finished_write() before calling the coredump handler which will
indirectly call vn_start_write() as necessary for each write.
2006-05-07 22:50:22 +00:00
Tor Egge
d302786c87 Temporarily unlock vnode for new image being executed to avoid lock order
reversals that can lead to deadlocks.  Normally vn_close(), namei() or vrele()
should not be called while holding vnode locks.
2006-05-05 20:25:05 +00:00
Pawel Jakub Dawidek
643df192de vn_start_write()/vn_finished_write() is not needed here, because
vn_start_write() is always called earlier in the code path and calling
the function recursively may lead to a deadlock.

Confirmed by:	tegge
MFC after:	2 weeks
2006-04-29 21:57:38 +00:00
Kris Kennaway
cef31ff7d9 Lock giant when assigning ni_vp and keep vfslocked state valid.
Committed for:	jeff
2006-04-29 07:13:49 +00:00
Pawel Jakub Dawidek
122410eea2 vn_start_write() is called only when v_type != VCHR, so corresponding
vn_finished_write() should also be called only then.

BTW. I fixed two functions here: vn_rdwr() and vn_write(). The latter seems
to be unused.

MFC after:	3 weeks
2006-04-28 21:54:05 +00:00
Robert Watson
3bf14fd5e9 Also check use_pty in the ptmx clone lookup; this means that when ptmx
support is turned off using the sysctl, we no longer even allow the
ptmx device to be looked up.

Foot provided by:	peter
2006-04-28 21:39:57 +00:00
Marcel Moolenaar
8f405ed335 Remove the puc-specific hacks. The puc(4) driver now properly uses
the rman(9) interface.
2006-04-28 21:23:09 +00:00
Jeff Roberson
6ca9fcc586 - Add a BO_NEEDSGIANT flag to the bufobj. This flag forces all child
buffers to go on the buf daemon's DIRTYGIANT queue.
 - Set BO_NEEDSGIANT on ffs's devvp since the ffs_copyonwrite handler
   runs in the context of the buf daemon and may require Giant.
2006-04-28 01:05:31 +00:00
Jeff Roberson
4b5b86816c - Consistently track ni_dvp and ni_vp with dvfslocked and vfslocked rather
than trying to optimize it into a single lock.  This adds more calls to
   lock giant with non smpsafe filesystems but is the only way to reliably
   hold the correct lock.
 - Remove an invalid assert in the mountedhere case in lookup and fix the
   code to properly deal with the scenario.  We can actually have a lookup
   that returns dp == dvp with mountedhere set with certain unmount races.

Tested by:	kris
Reported by:	kris/mohans
2006-04-28 00:59:48 +00:00
John-Mark Gurney
5c06d111b8 back out for now... revert ccpu to being kern.ccpu... 2006-04-27 17:57:59 +00:00
John-Mark Gurney
c71ce6a445 move remaining sysctl into the kern.sched tree... 2006-04-26 19:42:38 +00:00
John Baldwin
ae110b53d1 Add some new commands to hopefully make it easier to diagnose lock-related
problems in ddb:
- "show threadchain [thread]" will start with the specified thread (or the
  current kdb thread by default) and show it's state.  If it is blocked on
  a lock, it will find the owner of the lock and show its state, etc.
- "show allchains" will find all of the threads that are blocked on a
  lock (but do not have any threads blocked on a lock they hold) and show
  the resulting thread chain.
- "show lockchain <lock>" takes a pointer to a lock_object (such as a
  mutex or rwlock).  If there is a turnstile for that lock, then it will
  display all the threads blocked on the lock.  In addition, for each
  thread blocked on the lock, it will display any contested locks they
  hold, and recurse on those locks to show any threads blocked on those
  locks, etc.
2006-04-25 20:28:17 +00:00
John Baldwin
de833b7c0c Use db_lookup_thread() to lookup the thread for the passed in address
and change 'show locks' to only list the locks for a given thread
rather than for all the threads in the process containing a specified
thread.
2006-04-25 20:24:23 +00:00
Marius Strobl
fa63296aba Remove last vestiges of sab(4). 2006-04-25 19:43:53 +00:00
Robert Watson
102ea03373 Extend getsock() to return the struct file flags read while holding the
file lock, in the style of fgetsock().

Modify accept1() to use getsock() instead of fgetsock(), relying on the
file descriptor reference rather than an acquired socket reference to
prevent the listen socket from being destroyed during accept().  This
avoids additional reference count operations, which should improve
performance, and also avoids accept1() operating on a socket whose file
descriptor has been torn down, which may have resulted in protocol
shutdown starting.

MFC after:	3 months
2006-04-25 11:48:16 +00:00
Maxim Konovalov
481f8fe85f Inherit LOCAL_CREDS option from listen socket for sockets returned
by accept(2).

PR:		kern/90644
Submitted by:	Andrey Simonenko
OK'ed by:	mdodd
Tested by:	NetBSD regress/sys/kern/unfdpass/unfdpass.c
MFC after:	1 month
2006-04-24 19:09:33 +00:00
Marcel Moolenaar
845652dd28 MFp4: Add the ipend() method to the serdev I/F to allow umbrella
drivers to obtain pending interrupt status from subordinate
	drivers.
2006-04-23 22:12:39 +00:00
Robert Watson
0cec9959e8 Assert that sockets passed into soabort() not be SQ_COMP or SQ_INCOMP,
since that removal should have been done a layer up.

MFC after:	3 months
2006-04-23 18:15:54 +00:00
Robert Watson
28ea180136 Add missing 'not' to SQ_COMP comment.
MFC after:	3 months
2006-04-23 15:37:23 +00:00
Robert Watson
6ca35d4b81 Move handling of SQ_COMP exception case in sofree() to the top of the
function along with the remainder of the reference checking code.  Move
comment from body to header with remainder of comments.  Inclusion of a
socket in a completed connection queue counts as a true reference, and
should not be handled as an under-documented edge case.

MFC after:	3 months
2006-04-23 15:33:38 +00:00
John Baldwin
f9ab2f134f Print td_name instead of p_comm if td_name is non-empty for
'show turnstile' and 'show sleepq'.
2006-04-21 20:40:43 +00:00
Paul Saab
95f16c1e2c Don't try to kill embryonic processes in killpg1(). This prevents
a race condition between fork() and kill(pid,sig) with pid < 0 that
can cause a kernel panic.

Submitted by:	up
MFC after:	3 weeks
2006-04-21 19:26:21 +00:00
Paul Saab
4f590175b7 Allow for nmbclusters and maxsockets to be increased via sysctl.
An eventhandler is used to update all the various zones that depend
on these values.
2006-04-21 09:25:40 +00:00
John-Mark Gurney
be4db476a6 const'ify resource_spec to note that we won't be changing anything while
releasing resources... also, NULL out the resources as we free them...
2006-04-20 01:44:16 +00:00
Warner Losh
0385d64761 r_spare1 and r_spare2 aren't needed. They aren't used. They can't be
accessed from outside of subr_rman.c.  Remove them.

Reviewed by: jmg (in theory)
2006-04-19 21:25:55 +00:00
John Baldwin
fea3efe5bf Implement rw_try_upgrade() and rw_downgrade(). rw_try_upgrade() makes a
single attempt at upgrading a read lock to a write lock, and rw_downgrade()
converts curthread's write lock into a read lock.
2006-04-19 21:06:52 +00:00
Wojciech A. Koszek
5884c1a098 'owner' is not used without SMP. Fix kernel build for such kernel
configurations.

Approved by:	jhb
2006-04-18 20:32:42 +00:00
John Baldwin
efa86db61d Adaptively spin before blocking on the turnstile if an rwlock is write
locked.  In general the adaptive spinning is similar to the same code
for mutexes with some extra trickiness in rw_wunlock_hard().  Specifically,
even though both wait bits might be set and we might have a turnstile with
at least one waiting thread, there might not be any threads blocked on the
queue we are not waking up (they might all be spinning), and we should
only preserve the waiting flag for the queue we aren't waking up if there
are in fact threads blocked on that queue.  Secondly, there might not be
any threads blocked on the queue we have chosen to waken threads from
(there might only be threads blocked on the other queue and the threads
for this queue are all spinning) in which case we disown the turnstile
instead of doing a braodcast and unpend.
2006-04-18 18:27:54 +00:00
John Baldwin
f1a4b852dc - Bring back turnstile_empty() which can check to see if an individual
queue on a turnstile is empty.
- Add a turnstile_disown() function that allows a thread to give up
  ownership of a turnstile w/o waking up any waiters.
2006-04-18 18:16:54 +00:00
Xin LI
4207c279d4 In vfs_hash_get(): mount point should never be changed
so explicitly constify the mp parameter.

Reviewed by:	phk
2006-04-18 08:05:08 +00:00
John Baldwin
38bf165fa1 - Add a rw_wowner() macro that just returns the owner of a write lock and
use it in places that only care about the write owner instead of
  rw_owner() as a baby step towards limited read-lock owner.
- Tidy the code that sets the WAITER flag bits to not duplicate a test
  around the atomic operation and the KTR trace in both of the lock
  functions.
2006-04-17 21:11:01 +00:00
John Baldwin
32553b153e Add a 'show sleepqueue' alias for 'show sleepq' in DDB. 2006-04-17 20:16:32 +00:00
John Baldwin
964b557211 Trim trailing whitespace. 2006-04-17 20:14:51 +00:00
John Baldwin
2971c36136 Add a new module_file() function that returns the linker_file_t associated
with a given module_t.  I use this in some the MOD_LOAD event handler for
some test kernel modules to ask the kernel linker to look up the linker
sets in my test modules. (I use linker sets to generate the list of
possible events that I then signal to execute via a sysctl.  On non-amd64,
ld(8) would resolve the entire linker set, but on amd64 I have to ask the
kernel linker to do it for me, and having the kernel linker do it works on
all archs.)
2006-04-17 19:44:44 +00:00
John Baldwin
0f180a7cce Change msleep() and tsleep() to not alter the calling thread's priority
if the specified priority is zero.  This avoids a race where the calling
thread could read a snapshot of it's current priority, then a different
thread could change the first thread's priority, then the original thread
would call sched_prio() inside msleep() undoing the change made by the
second thread.  I used a priority of zero as no thread that calls msleep()
or tsleep() should be specifying a priority of zero anyway.

The various places that passed 'curthread->td_priority' or some variant
as the priority now pass 0.
2006-04-17 18:20:38 +00:00
John-Mark Gurney
e98b5a89de remove duplicate sizeof vnode entry (debug.sizeof.vnode already existed)...
move ncsize into debug.sizeof and rename to namecache...
2006-04-16 18:38:30 +00:00
Scott Long
bb141be10a Take a better stab at making this compile. 2006-04-15 18:54:56 +00:00
Scott Long
83bc5d54c8 Take a stab at making this compile. 2006-04-15 18:04:04 +00:00
John Baldwin
76447e5618 Mark the thread pointer used during an adaptive spin volatile so that the
compiler doesn't decide to cache td_state.  Cachine the state would cause
the spinning thread to not notice when the owning thread stopped executing
(if it was preempted for example) which could result in livelock.
2006-04-14 19:51:50 +00:00
John Baldwin
a29b4f6eec Drop the kqueue global mutex as soon as we are finished with it rather
than keeping it locked until we exit the function to optimize the case
where the lock would be dropped and later reacquired.  The optimization
was broken when kevent's were moved from UFS to VFS and the knote list
lock for a vnode kevent became the lockmgr vnode lock.  If one tried
to use a kqueue that contained events for a kqueue fd followed by a vnode,
then the kq global lock would end up being held when the vnode lock was
acquired which could result in sleeping with a mutex held (and subsequent
panics) if the vnode lock was contested.

Reviewed by:	jmg
Tested by:	ps (on 6.x)
MFC after:	3 days
2006-04-14 14:27:28 +00:00
David Xu
cfd6f8cd6c Clear TDF_SINTR in sleepq_resume_thread, also sleepq_catch_signal does
not need to clear it now, this should fix panic when msleep is recursivly
called. Patch is slightly adjusted after review.

Reviewed by: jhb
Tested by: Csaba Henk, csaba-ml at creo.hu
MFC after: 3 days
2006-04-13 23:29:25 +00:00
John Baldwin
9477358d00 Turn on ithread_destroy() and call it from intr_event_destroy() to tear
down an interrupt event's associated thread (if it has one).
2006-04-13 17:29:04 +00:00
Christian S.J. Peron
d5e5634075 Kill the last Giant acquisition in the exit(2) code. This Giant acquisition
doesn't appear to be protecting anything. Most of consumers funsetownlst(9)
do not appear to be picking up Giant anywhere. This was originally a part
of my Giant exit(2) clean up revision 1.272 but I thought it was a good idea
to leave it out until we were able to analyze it better.

Tested by:	kris
MFC after:	3 weeks
2006-04-10 14:07:28 +00:00
Pawel Jakub Dawidek
0909f38a3c On shutdown try to turn off all swap devices. This way GEOM providers are
properly closed on shutdown.

Requested by:	ru
Reviewed by:	alc
MFC after:	2 weeks
2006-04-10 10:03:41 +00:00
David Xu
e631cff309 Use proc lock to prevent a thread from exiting, Giant was no longer used to
protect thread list.
2006-04-10 04:55:59 +00:00
Robert Watson
d37b79a00f Remove UNIX domain socket raw socket support. This feature is documented
as being undocumented in Stevens, and was broken in 1997 during network
stack infrastructure work.  It is the one remaining (and incorrect)
direct protocol reference to raw_usrreq.pru_attach; this is incorrect
because the raw socket code assumes that raw_uattach is called only after
the protocol has allocated a PCB.

MFC after:	3 months
2006-04-09 16:29:47 +00:00
Marcel Moolenaar
07c8931358 Add the scc_hwmtx spin mutex, defined by scc(4). 2006-04-07 22:15:54 +00:00
John-Mark Gurney
1c4ca5e5fe spell unlock correctly, this is relatively minor as it's rare someone would
provide a lock method, and want the default unlock, but it is a bug...

PR:		95356
Submitted by:	Stephen Corteselli
MFC after:	3 days
2006-04-07 17:21:27 +00:00
Jeff Roberson
b53bf1269c - VFS_LOCK_GIANT when recycling a vnode via getnewvnode. We may be
recycling for an unrelated filesystem.  I really don't like potentially
   acquiring giant in the context of a giantless filesystem but there
   are reasonable objections to removing the recycling from this path.

Sponsored by:	Isilon Systems, Inc.
2006-04-04 06:46:10 +00:00
Jeff Roberson
4b24e4210e - Properly check against B_DELWRI and B_NEEDSGIANT. This check was
incorrectly written and caused some !NEEDSGIANT buffers to be put in
   the NEEDSGIANT queue.

Sponsored by:	Isilon Systems, Inc.
2006-04-04 06:44:21 +00:00
Marcel Moolenaar
39eb1d1263 Increment kdb_active after we stopped the other CPUs and decrement
kdb_active before we restart them. This avoids false positives on
restarted CPUs when they test for kdb_active while kdb_trap() is
still finishing up.
2006-04-04 00:40:20 +00:00
Marcel Moolenaar
bfcdefd8aa Eliminate HAVE_STOPPEDPCBS. On ia64 the PCPU holds a pointer to the
PCB in which the context of stopped CPUs is stored. To access this
PCB from KDB, we introduce a new define, called KDB_STOPPEDPCB. The
definition, when present, lives in <machine/kdb.h> and abstracts
where MD code saves the context. Define KDB_STOPPEDPCB on i386,
amd64, alpha and sparc64 in accordance to previous code.
2006-04-03 22:51:47 +00:00
Peter Wemm
b9eee07e36 Remove the unused sva and eva arguments from pmap_remove_pages(). 2006-04-03 21:16:10 +00:00
Marcel Moolenaar
5991a4f811 In kdb_trap(), change the type of the local variable 'intr' from int
to register_t, as intr_disable() returns the latter and register_t
may be wider than int.

Pointed out by: marius@
2006-04-03 20:55:52 +00:00
Marcel Moolenaar
2fae8f5aed Replace critical_enter() and critical_exit() in kdb_trap() with
intr_disable() and intr_restore() resp. Previously, critical
regions would have interrupts disabled, but that was changed.
Consequently, the debugger could run with interrupts enabled.
This could cause problems for the low-level console code where
received characters would trigger an interrupt that causes
the interrupt handler to read the character instead of the
cngetc() function.
2006-04-03 17:48:09 +00:00
John-Mark Gurney
5e6125891f mask out any action when copying the flags from the event to the knote..
Pointed out by:	Václav Haisman
Submitted by:	Dan Nelson (slightly modifed patch)
MFC after:	3 days
2006-04-01 20:15:39 +00:00
Robert Watson
bc725eafc7 Chance protocol switch method pru_detach() so that it returns void
rather than an error.  Detaches do not "fail", they other occur or
the protocol flags SS_PROTOREF to take ownership of the socket.

soclose() no longer looks at so_pcb to see if it's NULL, relying
entirely on the protocol to decide whether it's time to free the
socket or not using SS_PROTOREF.  so_pcb is now entirely owned and
managed by the protocol code.  Likewise, no longer test so_pcb in
other socket functions, such as soreceive(), which have no business
digging into protocol internals.

Protocol detach routines no longer try to free the socket on detach,
this is performed in the socket code if the protocol permits it.

In rts_detach(), no longer test for rp != NULL in detach, and
likewise in other protocols that don't permit a NULL so_pcb, reduce
the incidence of testing for it during detach.

netinet and netinet6 are not fully updated to this change, which
will be in an upcoming commit.  In their current state they may leak
memory or panic.

MFC after:	3 months
2006-04-01 15:42:02 +00:00
Robert Watson
ac45e92ff2 Change protocol switch pru_abort() API so that it returns void rather
than an int, as an error here is not meaningful.  Modify soabort() to
unconditionally free the socket on the return of pru_abort(), and
modify most protocols to no longer conditionally free the socket,
since the caller will do this.

This commit likely leaves parts of netinet and netinet6 in a situation
where they may panic or leak memory, as they have not are not fully
updated by this commit.  This will be corrected shortly in followup
commits to these components.

MFC after:      3 months
2006-04-01 15:15:05 +00:00
Robert Watson
fa4c5373ce Add comment to accept1() that it should use getsock() instead of fgetsock()
to avoid additional mutex operations, and also to avoid use of soref/sorele
which are now not preferred.

MFC after:	3 months
2006-04-01 11:14:56 +00:00
Robert Watson
197b35d717 Mark fgetsock() and fputsock() as depcrecated: callers should rely on
the file descriptor reference, rather than paying additional lock
operations to acquire a socket reference from the file descriptor.
This will also help to ensure that file descriptor based socket
requests are not delivered to a socket after close.  Most consumers
have already been converted to this model.

MFC after:	3 months
2006-04-01 11:09:54 +00:00
Robert Watson
7f689de232 Assert so->so_pcb is NULL in sodealloc() -- the protocol state should not
be present at this point.  We will eventually remove this assert because
the socket layer should never look at so_pcb, but for now it's a useful
debugging tool.

MFC after:	3 months
2006-04-01 10:45:52 +00:00
Robert Watson
220c1357ed Add a somewhat sizable comment documenting the semantics of various kernel
socket calls relating to the creation and destruction of sockets.  This
will eventually form the foundation of socket(9), but is currently in too
much flux to do so.

MFC after:	3 months
2006-04-01 10:43:02 +00:00
Jeff Roberson
0af2472199 - Add an assert to vgone. It is illegal to call vgone without a reference
to the vnode.  Without a reference the vnode will never be vdestroy'd
   and the memory will never be reclaimed.

Sponsored by:	Isilon Systems, Inc.
2006-03-31 23:39:26 +00:00
Jeff Roberson
ba5eb429e3 - When there are dangling vnodes at unmount print them before we panic.
Sponsored by:	Isilon Systems, Inc.
2006-03-31 23:38:15 +00:00
Jeff Roberson
3bbd6d8ae6 - Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs().
Discussed with:	tegge
Tested by:	kris
Sponsored by:	Isilon Systems, Inc.
2006-03-31 03:54:20 +00:00
Jeff Roberson
94bc95db3c - Hold a reference from the time vfs_busy starts until vfs_unbusy is
called.
 - vfs_getvfs has to return a reference to prevent the returned mountpoint
   from changing identities.
 - Release references acquired via vfs_getvfs.

Discussed with:	tegge
Tested by:	kris
Sponsored by:	Isilon Systems, Inc.
2006-03-31 03:53:25 +00:00
Jeff Roberson
c5fcce21c5 - GETWRITEMOUNT now returns a referenced mountpoint to prevent its
identity from changing.  This is possible now that mounts are not freed.

Discussed with:	tegge
Tested by:	kris
Sponsored by:	Isilon Systems, Inc.
2006-03-31 03:52:24 +00:00
Jeff Roberson
a218edceb2 - Allocate mounts from a uma zone that uses UMA_ZONE_NOFREE to prevent
mount memory from being reclaimed.  This resolves a number of race
   conditions described in vfs_default.c and introduced with the
   VFS_LOCK_GIANT macros.
 - Let the mtx and lock remain valid after the mount structure has been
   freed by using init and fini calls.  Technically fini will never be
   called but is included for completeness.
 - Consistently use lockmgr directly rather than lockmgr to lock and
   vfs_unbusy to unlock.

Discussed with:	tegge
Tested by:	kris
Sponsored by:	Isilon Systems, Inc.
2006-03-31 03:49:51 +00:00
Jeff Roberson
fdf86b2dcd - LK_RETRY means nothing when passed to VOP_LOCK. Call vn_lock instead.
- Move the vn_lock of the dvp until after we've unbusied the filesystem
   to avoid a LOR with the mount point lock.
 - In the v_mountedhere while loop we acquire a new instance of giant each
   time through without releasing the first.  This would cause us to leak
   Giant.

Sponsored by:	Isilon Systems, Inc.
2006-03-31 02:59:23 +00:00
Jeff Roberson
084d64ac21 - Add the B_NEEDSGIANT flag which is only set if the vnode that owns a buf
requires Giant.  It is set in bgetvp and cleared in brelvp.
 - Create QUEUE_DIRTY_GIANT for dirty buffers that require giant.
 - In the buf daemon, only grab giant when processing QUEUE_DIRTY_GIANT and
   only if we think there are buffers in that queue.

Sponsored by:	Isilon Systems, Inc.
2006-03-31 02:56:30 +00:00
Sam Leffler
00537061dd fixup error handling in taskqueue_start_threads: check for kthread_create
failing, print a message when we fail for some reason as most callers do
not check the return value (e.g. 'cuz they're called from SYSINIT)

Reviewed by:	scottl
MFC after:	1 week
2006-03-30 23:06:59 +00:00
Pawel Jakub Dawidek
177a987379 Fix a panic on sparc64 related to inproper aligment - we cannot assume,
that 'unsigned char *' argument is 4 byte aligned.

MFC after:	3 days
2006-03-30 18:45:50 +00:00
Marcel Moolenaar
6174e6ed12 Add scc(4), a driver for serial communications controllers. These
controllers typically have multiple channels and support a number
of serial communications protocols. The scc(4) driver is itself
an umbrella driver that delegates the control over each channel
and mode to a subordinate driver (like uart(4)).
The scc(4) driver supports the Siemens SAB 82532 and the Zilog
Z8530 and replaces puc(4) for these devices.
2006-03-30 18:33:22 +00:00
Paul Saab
fbb273bc05 Properly support for FreeBSD 4 32bit System V shared memory.
Submitted by:	peter
Obtained from:	Yahoo!
MFC after:	3 weeks
2006-03-30 07:42:32 +00:00
John Baldwin
4b3b0413d2 Always explicitly panic in propogate_priority() if we try to propogate
a lock's priority to a sleeping thread.  When we panic, dump a stack
trace of the thread that is asleep if DDB is compiled into the kernel
just before calling panic().  This is much more informative and useful
for debugging than the current behavior of getting a page fault and not
having an easy way of determining which thread caused the original problem.

MFC after:	1 week
2006-03-29 23:24:55 +00:00
John-Mark Gurney
4e095bc045 hold the list lock over the f_event and KNOTE_ACTIVATE calls... This closes
a race where data could come in before we clear the INFLUX flag, and get
skipped over by knote (and hence never be activated, though it should of
been)...

Found by:	glebius & co.
Reviewed by:	glebius
MFC after:	3 days
2006-03-29 18:15:30 +00:00
John Baldwin
33f19bee6f - Conditionalize Giant around VFS operations for ALQ, ktrace, and
generating a coredump as the result of a signal.
- Fix a bug where we could leak a Giant lock if vn_start_write() failed
  in coredump().

Reported by:	jmg (2)
2006-03-28 21:30:22 +00:00
John Baldwin
11178ee4c1 Conditionalize locking of Giant for VFS in acct(2). We already
conditionally acquired Giant in the other parts of the accounting code.
2006-03-28 21:26:59 +00:00
John Baldwin
861dab08e7 Change vn_open() to honor the MPSAFE flag in the passed in nameidata object
and use that instead of testing fdidx against -1 to determine if it should
release Giant if Giant was locked due to the requested file residing on a
non-MPSAFE VFS.

Discussed with:	jeff
2006-03-28 21:22:08 +00:00
Dag-Erling Smørgrav
867c089bc7 Revert previous commit at davidxu's insistance. Instead, use __DECONST
(argh!) and rearrange the prototypes to make it clear that _umtx_op()
is not deprecated.
2006-03-28 14:32:38 +00:00
Dag-Erling Smørgrav
b3efbabe87 The undocumented and deprecated system call _umtx_op() takes two pointer
arguments.  The first one is never used (all callers pass in 0); the
second is sometimes used to pass in a struct timespec * which is used as
a timeout and never modified.  Constify that argument so callers can pass
a const struct timespec * without jumping through hoops.
2006-03-28 09:18:34 +00:00
Alan Cox
7c8dcf2def Use NET_LOCK_GIANT() and VFS_LOCK_GIANT() instead of unconditionally
acquiring Giant in kern_sendfile().

Guard against the forced reclamation of a vnode in kern_sendfile().

Discussed with: jeff
Reviewed by: tegge
MFC after: 3 weeks
2006-03-27 04:23:16 +00:00
Robert Watson
63b01ffd34 Add a sysctl, regression.sonewconn_earlytest, which when options
REGRESSION is enabled, allows user space to dictate that sonewconn()
should skip it's "skip the hard work" check to see if the listen
queue is full, and instead proceed with allocation of a socket and
trimming of the overflowed queue.  This makes it easier to test the
queue overflow logic.

MFC after:	1 month
2006-03-26 22:44:37 +00:00
Joseph Koshy
49874f6ea3 MFP4: Support for profiling dynamically loaded objects.
Kernel changes:

  Inform hwpmc of executable objects brought into the system by
  kldload() and mmap(), and of their removal by kldunload() and
  munmap().  A helper function linker_hwpmc_list_objects() has been
  added to "sys/kern/kern_linker.c" and is used by hwpmc to retrieve
  the list of currently loaded kernel modules.

  The unused `MAPPINGCHANGE' event has been deprecated in favour
  of separate `MAP_IN' and `MAP_OUT' events; this change reduces
  space wastage in the log.

  Bump the hwpmc's ABI version to "2.0.00".  Teach hwpmc(4) to
  handle the map change callbacks.

  Change the default per-cpu sample buffer size to hold
  32 samples (up from 16).

  Increment __FreeBSD_version.

libpmc(3) changes:

  Update libpmc(3) to deal with the new events in the log file; bring
  the pmclog(3) manual page in sync with the code.

pmcstat(8) changes:

  Introduce new options to pmcstat(8): "-r" (root fs path), "-M"
  (mapfile name), "-q"/"-v" (verbosity control).  Option "-k" now
  takes a kernel directory as its argument but will also work with
  the older invocation syntax.

  Rework string handling in pmcstat(8) to use an opaque type for
  interned strings.  Clean up ELF parsing code and add support for
  tracking dynamic object mappings reported by a v2.0.00 hwpmc(4).

  Report statistics at the end of a log conversion run depending
  on the requested verbosity level.

Reviewed by:	jhb, dds (kernel parts of an earlier patch)
Tested by:	gallatin (earlier patch)
2006-03-26 12:20:54 +00:00
David Xu
dbbccfe923 1. Move code for scanning pending I/O from aio_fsync to aio_aqueue,
it has less overhead.
2. Avoid scheduling task if maximum number of I/O threads is reached.
2006-03-24 00:50:06 +00:00
David Xu
177e987e63 Regenerate. 2006-03-23 08:48:37 +00:00
David Xu
99eee864ad Implement aio_fsync() syscall. 2006-03-23 08:46:42 +00:00
Pawel Jakub Dawidek
96c0381f5c Destroy "bip" bio in error case.
Found by:	Coverity Prevent analysis tool
Coverity ID:	795
MFC after:	3 days
2006-03-22 00:42:41 +00:00
Jeff Roberson
bacb51fb67 - Remove explicit giant acquires and replace it with VFS_LOCK_GIANT.
Sponsored by:	Isilon Systems, Inc.
2006-03-22 00:00:05 +00:00
Jeff Roberson
77c79550af - Remove explicit calls to lock and unlock Giant and replace them with
VFS_LOCK_GIANT/VFS_UNLOCK_GIANT calls.  This completely removes Giant
   acquisition in the syscall path for ffs.

Bug fix to kern_fhstatfs from:	Todd Miller <Todd.Miller@sparta.com>
Sponsored by:	Isilon Systems, Inc.
2006-03-21 23:58:37 +00:00
David Xu
bf1a322061 Rethink it a bit, if there is a STOP flag, don't bother to resume other
threads.
2006-03-21 10:05:15 +00:00
David Xu
568b4ebbcc Because JOB control has higher priority than single threading in
thread_suspend_check(), call thread_stopped() to report SIGCHLD
if there is JOB control in progress.
2006-03-21 08:41:15 +00:00
Tor Egge
41d7199b9d Remove unused leaked debug function prototype. 2006-03-21 01:04:24 +00:00
Christian S.J. Peron
2ed4894a26 Restore fd optimization with a few minor tweaks, to quote tegge:
"fdinit() fails to initialize newfdp->fd_fd.fd_lastfile to -1.  This breaks
fdcopy() which will incorrectly set newfdp->fd_freefile to 1 if no files are
open and the last file descriptor marked as unused for fdp was 0.  This later
causes descriptor 0 to be unavailable in newfdp when the optimization is
enabled.

When the last file descriptor previously marked as used is nonzero and marked
as unused, fdunused() incorrectly sets fdp->fd_lastfile to fd - 1 due to
fd_last_used() returning (size - 1).  This hides the problem that breaks the
optimization."

This allows us to keep the optimization, while un-breaking it.

This is a RELENG_6 candidate.

PR:		kern/87208
MFC after:	1 week
Submitted by:	tegge
2006-03-20 00:13:47 +00:00
Tor Egge
7de3839d0d Let snapshots make a copy of old contents for all buffers taking part in a
cluster instead of just the first buffer.

Delay buf_start() calls until snapshots have a copy of old content.

PR:		kern/93942
2006-03-19 21:43:36 +00:00
Tor Egge
d50ef66d03 Don't call vn_finished_write() if vn_start_write() failed. 2006-03-19 20:43:07 +00:00
Jeff Roberson
e44270a781 - Correct an assert in vop_rename_pre. fdvp may be locked if it is either
the target directory or file.  This case should fail in the filesystem
   anyway and perhaps kern_rename() should catch it.

Sponsored by:	Isilon Systems, Inc.
2006-03-19 20:14:46 +00:00
Christian S.J. Peron
30bacc08e0 Back out fd optimization introduced in revision 1.280 as it appears to be
really breaking things. Simple "close(0); dup(fd)" does not return descriptor
"0" in some cases. Further, this change also breaks some MAC interactions with
mac_execve_will_transition().  Under certain circumstances, fdcheckstd() can
be called in execve(2) causing an assertion that checks to make sure that
stdin, stdout and stderr reside at indexes 0, 1 and 2 in the process fd table
to fail, resulting in a kernel panic when INVARIANTS is on.

This should also kill the "dup(2) regression on 6.x" show stopper item on the
6.1-RELEASE TODO list.

This is a RELENG_6 candidate.

PR:		kern/87208
Silence from:	des
MFC after:	1 week
2006-03-18 23:27:21 +00:00
Robert Watson
4d4b555efa Modify UNIX domain sockets to guarantee, and assume, that so_pcb is always
defined for an in-use socket.  This allows us to eliminate countless tests
of whether so_pcb is non-NULL, eliminating dozens of error cases.  For
now, retain the call to sotryfree() in the uipc_abort() path, but this
will eventually move to soabort().

These new assumptions should be largely correct, and will become more so
as the socket/pcb reference model is fixed.  Removing the notion that
so_pcb can be non-NULL is a critical step towards further fine-graining
of the UNIX domain socket locking, as the so_pcb reference no longer
needs to be protected using locks, instead it is a property of the socket
life cycle.
2006-03-17 13:52:57 +00:00
Alan Cox
41634e2e8d Correct two vm object reference leaks in error cases.
Submitted by: davidxu
2006-03-16 08:51:59 +00:00
Robert Watson
92c07a345e Change soabort() from returning int to returning void, since all
consumers ignore the return value, soabort() is required to succeed,
and protocols produce errors here to report multiple freeing of the
pcb, which we hope to eliminate.
2006-03-16 07:03:14 +00:00
David Xu
795a11d049 Fix a race between file operations and rfork(RFCFDG) by parking
all other threads at user boundary, the race can crash kernel
under stress testing.

Reviewed by: jhb
MFC after: 3 days
2006-03-15 23:24:14 +00:00
Sam Leffler
47e2996e8b promote fast ipsec's m_clone routine for public use; it is renamed
m_unshare and the caller can now control how mbufs are allocated

Reviewed by:	andre, luigi, mlaier
MFC after:	1 week
2006-03-15 21:11:11 +00:00
Poul-Henning Kamp
590487078f Disable the "cputick increased..." message now that the dust has settled. 2006-03-15 20:22:32 +00:00
Alexander Leidinger
a8f47039c7 Fix memory leak introduced in previous revision.
Discussed with:	phk
2006-03-15 19:23:08 +00:00
Robert Watson
93709ad0be As with socket consumer references (so_count), make sofree() return
without GC'ing the socket if a strong protocol reference to the socket
is present (SS_PROTOREF).
2006-03-15 12:45:35 +00:00
David Xu
e170bfda56 1. Count last time slice, this intends to fix
"calcru: runtime went backwards" bug for threaded process.
2. Add comment about possible logical problem with scheduler.

MFC after: 3 days
2006-03-14 04:00:21 +00:00
John-Mark Gurney
45e0d0aa30 spell pdata correctly, we now will only dump maxlen of each mbuf in the
chain, instead of the entire mbuf...  This should probably be reworked
so that it prints at max maxlen bytes for the entire chain...
2006-03-14 00:22:10 +00:00
Ruslan Ermilov
936ddefcd6 The mount(8) manpage says: "In case of conflicting options being
specified, the rightmost option takes effect."  Fix code to obey
this.  This makes e.g. "mount -r /usr" or "mount -ar" actually
mount file systems read-only.
2006-03-13 14:58:37 +00:00
David Xu
28e989e9ca Remove unused code. 2006-03-13 10:37:25 +00:00
Christian S.J. Peron
a19fd0e766 Make sure that we are adding a path token to the audit record in open(2).
Do this by making sure we are using the AUDITVNODE1 mask in the namei flags.

Obtained from:	TrustedBSD Project
2006-03-11 17:14:05 +00:00
Poul-Henning Kamp
272601f8f0 Go over calcru and friends once more.
Reintroduce the monotonicity for the normal case and make the two
special cases behave in what is belived to be the most sensible fasion.
2006-03-11 10:48:19 +00:00
Tor Egge
ca2fa80767 Block secondary writes while expunging active unlinked files.
Fix detection of active unlinked files by checking VI_OWEINACT and
VI_DOINGINACT in addition to v_usecount.

Defer inactive handling for unlinked files if the file system is mostly
suspended (secondary writes being blocked).

Perform deferred inactive handling after the file system is resumed.
2006-03-11 01:08:37 +00:00
Jung-uk Kim
0d84d9ebb5 Implement printf 'X' conversion for both libstand and kernel. 2006-03-09 22:37:34 +00:00
Poul-Henning Kamp
fef527ee73 Oops, forgot newline. 2006-03-09 09:44:10 +00:00
Poul-Henning Kamp
0f038c05ea Add slop to "backwards" cpu accounting messages, 3 usec or 1% whichever
triggers.

This should eliminate all the trivial messages which result from minor
increases in cpu_tick frequency.

Machines which don't du cpu clock fiddling shouldn't issue "backwards"
messages now.

Laptops and other machines where the initial estimate of cputicks may be
waaaay off will still issue warnings.
2006-03-09 09:33:17 +00:00
Poul-Henning Kamp
6cda760f09 silence cpu_tick calibration and notice only (under bootverbose)
when the frequency increases.
2006-03-09 09:30:33 +00:00
Poul-Henning Kamp
c8d7706e75 Ignore kenv strings which overflow the room we have, rather than pretend
we have room for them.
2006-03-09 09:29:41 +00:00
David Xu
7b8d5e4865 Remove _STOPEVENT call, it is already called in issignal, simplify
code for SIGKILL signal.
2006-03-09 08:31:51 +00:00
Tor Egge
791dd2fade Use vn_start_secondary_write() and vn_finished_secondary_write() as a
replacement for vn_write_suspend_wait() to better account for secondary write
processing.

Close race where secondary writes could be started after ffs_sync() returned
but before the file system was marked as suspended.

Detect if secondary writes or softdep processing occurred during vnode sync
loop in ffs_sync() and retry the loop if needed.
2006-03-08 23:43:39 +00:00
Stephan Uphoff
68ff3c2445 Fix exec_map resource leaks.
Tested by: kris@
2006-03-08 20:21:54 +00:00
Andre Oppermann
a7bd90ef93 Properly handle the case when the packet secondary zone can't allocate
further mbuf clusters to attach to mbufs.

Reported by:	kris
Tested by:	kris
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-03-08 14:05:38 +00:00
John Baldwin
88ca07e79a Style nit. 2006-03-07 22:17:26 +00:00
John Baldwin
67c0796ca3 For consistency sake, use >= MINCLSIZE rather than > MINCLSIZE to determine
whether or not to allocate a full mbuf cluster rather than just a plain
mbuf when adding on additional mbufs in m_getm().  In practice, there wasn't
any resulting mem trashing since m_getm() doesn't ever allocate an mbuf with
a packet header, and MINCLSIZE is the available payload in an mbuf with a
header rather than the available payload in a plain mbuf.

Discussed with: 	andre (lightly)
2006-03-07 21:31:20 +00:00
Poul-Henning Kamp
fccfcfba00 Add missing cast. 2006-03-04 06:07:26 +00:00
Poul-Henning Kamp
5b51d1de62 More detailed logging if timestepwarnings are enabled. 2006-03-04 06:06:43 +00:00
Paul Saab
6308f39da8 use strlcpy in cvtstatfs and copy_statfs instead of bcopy to ensure
the copied strings are properly terminated.

bzero the statfs32 struct in copy_statfs.
2006-03-04 00:09:09 +00:00
Paul Saab
45d48bdad5 Fix bug in malloc_uninit():
Releasing items from the mt_zone can not be done by a simple
uma_zfree() call since mt_zone is allocated with the UMA_ZONE_MALLOC
flag. Use uma_zfree_arg instead and supply the slab.

This bug caused panics in low memory situations on unloading kernel
modules containing MALLOC_DEFINE(..) statements.

Submitted by:	ups
2006-03-03 22:36:52 +00:00
Paul Saab
6815739e00 Don't truncate f_mntfromname & f_mntonname to 16 characters when
translating statfs into ostatfs.  This allows 4.x binaries making
statfs calls to work on 6.x.
2006-03-03 07:20:54 +00:00
Marcus Alves Grando
b4130b8ae0 - Print message about cpufreq and timecounter TSC
Approved by:	njl
MFC after:	1 day
2006-03-03 02:06:04 +00:00
Tor Egge
3b582b4e72 Eliminate a deadlock when creating snapshots. Blocking vn_start_write() must
be called without any vnode locks held.  Remove calls to vn_start_write() and
vn_finished_write() in vnode_pager_putpages() and add these calls before the
vnode lock is obtained to most of the callers that don't already have them.
2006-03-02 22:13:28 +00:00
Tor Egge
b983aac762 Don't try to show marker nodes. 2006-03-02 21:31:15 +00:00
David Xu
3dfcaad667 Add signal set sq_kill to sigqueue structure, the member saves all
signals sent by kill() syscall, without this, a signal sent by
sigqueue() can cause a signal sent by kill() to be lost.
2006-03-02 14:06:40 +00:00
Poul-Henning Kamp
301af28a06 Suffer a little bit of math every 16 second and tighten calibration of
cpu_ticks to the low side of PPM.
2006-03-02 08:09:46 +00:00
Jeff Roberson
eb2ea10590 - Move softdep from using a global worklist to per-mount worklists. This
has many positive effects including improved smp locking, reducing
   interdependencies between mounts that can lead to deadlocks, etc.
 - Add the softdep worklist and various counters to the ufsmnt structure.
 - Add a mount pointer to the workitem and remove mount pointers from the
   various structures derived from the workitem as they are now redundant.
 - Remove the poor-man's semaphore protecting softdep_process_worklist and
   softdep_flushworklist.  Several threads may now process the list
   simultaneously.
 - Add softdep_waitidle() to block the thread until all pending
   dependencies being operated on by other threads have been flushed.
 - Use softdep_waitidle() in unmount and snapshots to block either
   operation until the fs is stable.
 - Remove softdep worklist processing from the syncer and move it into the
   softdep_flush() thread.  This thread processes all softdep mounts
   once each second and when it is called via the new softdep_speedup()
   when there is a resource shortage.  This removes the softdep hook
   from the kernel and various hacks in header files to support it.

Reviewed by/Discussed with:	tegge, truckman, mckusick
Tested by:	kris
2006-03-02 05:50:23 +00:00
David Xu
80452384e6 Regenerate. 2006-03-01 06:49:38 +00:00
David Xu
61d3a4efc2 Let kernel POSIX timer code and mqueue code to use integer as a resource
handle, the timer_t and mqd_t types will be a pointer which userland
will define it.
2006-03-01 06:29:34 +00:00
Paul Saab
fa545f434c Fix 32bit sendfile by implementing kern_sendfile so that it takes
the header and trailers as iovec arguments instead of copying them
in inside of sendfile.

Reviewed by:	jhb
MFC after:	3 weeks
2006-02-28 19:39:18 +00:00
Gleb Smirnoff
73bb09f2d0 One more grammar nit.
Submitted by:	ru
2006-02-27 07:22:32 +00:00
David Xu
27b8220d12 1. Remove aio entry from lists earlier in aio_free_entry,
so other threads can not see it if we unlock the proc
   lock (this can happen in knlist_delete).  Don't do wakeup,
   it is not necessary.

2. Decrease kaio_buffer_count in biohelper rather than
   doing it in aio_bio_done_notify.

3. In aio_bio_done_notify, don't send notification if KAIO_RUNDOWN
   was set, because the process is already in single thread mode.

4. Use assignment to initialize aiothreadflags.

5. AIOCBLIST_RUNDOWN is not useful, axe the code using it.

6. use LIO_NOP instead of zero.
2006-02-26 12:56:23 +00:00
Gleb Smirnoff
fcf9061858 Fix several typos and trim spaces at eol.
PR:		kern/93759
Submitted by:	Antoine Brodin <antoine.brodin laposte.net>
2006-02-26 11:44:28 +00:00
Scott Long
6ec6fb9bc6 Always print a newline char at the end of the line. 2006-02-25 16:20:22 +00:00
John Baldwin
b36f458861 Use the recently added msleep_spin() function to simplify the
callout_drain() logic.  We no longer need a separate non-spin mutex to
do sleep/wakeup with, instead we can now just use the one spin mutex to
manage all the callout functionality.
2006-02-23 19:13:12 +00:00
David Xu
7e0221a251 1. Refine kern_sigtimedwait() to remove redundant code.
2. Fix a bug, if thread got a SIGKILL signal, call sigexit() to kill
   its process.

MFC after: 3 days
2006-02-23 09:24:19 +00:00
David Xu
7c9a98f15b Code cleanup, simply compare with curproc. 2006-02-23 05:50:55 +00:00
Jeff Roberson
8febcfb92f - Use vfs_ref/rel to protect a mountpoint from going away while VFS_STATFS
is being called.  Be sure to grab the ref before we unlock the vnode to
   prevent the mount from disappearing.

Tested by:	kris
2006-02-23 05:18:07 +00:00
Jeff Roberson
a1db11fc40 - Release the mount ref once the vnode has been recycled rather than once
the last reference is dropped.  I forgot that vnodes can stick around
   for a very long time until processes discover that they are dead.  This
   means that a vnode reference is not sufficient to keep the mount
   referenced and even more code will be required to ref mount points.

Discovered by:	kris
2006-02-23 05:15:37 +00:00
David Xu
dc94f5e383 Move comments to more accurate place. 2006-02-23 03:42:17 +00:00
David Xu
c008d51784 Fix a sleep queue race for KSE thread.
Reviewed by: jhb
2006-02-23 00:13:58 +00:00
John Baldwin
daad1cd74d Fixup some comments. Mutexes's are locked, not entered for several years
now and msleep blocks threads rather than processes.
2006-02-22 20:46:10 +00:00
John Baldwin
06ad42b2f7 Close some races between procfs/ptrace and exit(2):
- Reorder the events in exit(2) slightly so that we trigger the S_EXIT
  stop event earlier.  After we have signalled that, we set P_WEXIT and
  then wait for any processes with a hold on the vmspace via PHOLD to
  release it.  PHOLD now KASSERT()'s that P_WEXIT is clear when it is
  invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops
  to zero.
- Change proc_rwmem() to require that the processing read from has its
  vmspace held via PHOLD by the caller and get rid of all the junk to
  screw around with the vmspace reference count as we no longer need it.
- In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it
  doesn't exist.
- Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers
  FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem()
  to clear an earlier single-step simualted via a breakpoint).  We only
  do one to avoid races.  Also, by making the EINVAL error for unknown
  requests be part of the default: case in the switch, the various
  switch cases can now just break out to return which removes a _lot_ of
  duplicated PRELE and proc unlocks, etc.  Also, it fixes at least one bug
  where a LWP ptrace command could return EINVAL with the proc lock still
  held.
- Changed the locking for ptrace_single_step(), ptrace_set_pc(), and
  ptrace_clear_single_step() to always be called with the proc lock
  held (it was a mixed bag previously).  Alpha and arm have to drop
  the lock while the mess around with breakpoints, but other archs
  avoid extra lock release/acquires in ptrace().  I did have to fix a
  couple of other consumers in kern_kse and a few other places to
  hold the proc lock and PHOLD.

Tested by:	ps (1 mostly, but some bits of 2-4 as well)
MFC after:	1 week
2006-02-22 18:57:50 +00:00
John Baldwin
54690b5679 Don't do a PHOLD() in kthread_create() w/o a matching PRELE() in
kthread_exit().  Rather than add the missing PRELE() I chose to just
axe the PHOLD() since it was redundant with the P_SYSTEM flag.

MFC after:	1 week
2006-02-22 17:21:45 +00:00
John Baldwin
8f95fc2481 Various style and comment fixes.
Submitted by:	bde
2006-02-22 16:58:48 +00:00
Wayne Salamon
bc5504b942 Add pathname and/or vnode argument auditing for the following system calls:
quotactl, statfs, fstatfs, fchdir, chdir, chroot, open, mknod, mkfifo,
link, symlink, undelete, unlink, access, eaccess, stat, lstat, pathconf,
readlink, chflags, lchflags, fchflags, chmod, lchmod, fchmod, chown,
lchown, fchown, utimes, lutimes, futimes, truncate, ftruncate, fsync,
rename, mkdir, rmdir, getdirentries, revoke, lgetfh, getfh, extattrctl,
extattr_set_file, extattr_set_link, extattr_get_file, extattr_get_link,
extattr_delete_file, extattr_delete_link, extattr_list_file, extattr_list_link.

In many cases the pathname and vnode auditing is done within namei lookup
instead of directly in the system call.

Audit the remaining arguments to these system calls:
fstatfs, fchdir, open, mknod, chflags, lchflags, fchflags, chmod, lchmod,
fchmod, chown, lchown, fchown, futimes, ftruncate, fsync, mkdir,
getdirentries.
2006-02-22 16:04:20 +00:00
Jeff Roberson
c5dcb84008 - Revert r1.406 until a solution can be found that doesn't break nfs. The
statfs handler in nfs will lock vnodes which may lead to deadlock or
   recursion.

Found by:	kris
Pointy hat to:	me
2006-02-22 09:52:25 +00:00
Jeff Roberson
a4aeaefe5a - We can not hold a vnode lock while we do a lookup. Search for and load
modules prior to looking up the directory which we will cover to avoid
   this problem in mount.
 - We must hold the coveredvp locked before we can busy the mountpoint to
   prevent a lock order reversal with the vfs_busy() in lookup which holds
   the directory lock prior to doing a vfs_busy().  The directory lock is
   required to safely clear the v_mountedhere field on the directory.

MFC After:	1 week
2006-02-22 06:29:55 +00:00
Jeff Roberson
8a7cd2fdfb - Grab a mnt ref in vfs_busy() before dropping the interlock. This will
prevent the mount point from going away while we're waiting on the lock.
   The ref does not need to persist once we have the lock because the
   lock prevents the mount point from being unmounted.

MFC After:	1 week
2006-02-22 06:20:12 +00:00
Jeff Roberson
05b6a20a66 - Hold the vnode used in the statfs related functions until we're done with
the VFS_STATFS call to prevent the mount from disappearing while we're
   stating.
 - Convert these routines to use MPSAFE namei semantics.

MFC After:	1 week
2006-02-22 06:19:08 +00:00
David Xu
ba0360b135 Abstract function mqfs_create_node() to create a mqueue node. 2006-02-22 02:38:25 +00:00
David Xu
ad8de0f243 If block size is zero, use normal file operations to do I/O,
this eliminates a divided-by-zero fault.

Recommended by: phk
2006-02-22 00:05:12 +00:00
John Baldwin
bd106be404 Move the ruadd() in kern_exit() to save our final stats in our child
stats even further down in exit1() so that it includes the runtime and
tick counts from the final time slice for the dying thread.

Reviewed by:	phk
2006-02-21 21:48:42 +00:00
John Baldwin
6fc6433ecd Split calcru() back into a calcru1() function shared with calccru() and
a calcru() wrapper that passes a local rusage_ext on the stack that is
a snapshot to do the calculations on.  Now we can pass p->p_crux to
calcru1() in calccru() again which fixes the issues with runtime going
backwards messages when dead processes are harvested by init.

Reviewed by:	phk
Tested by:	Stefan Ehmann shoesoft at gmx dot net
2006-02-21 21:47:46 +00:00
Andre Oppermann
80444f8803 The sysctls kern.ipc.[max_linkhdr|max_protohdr|max_hdr|max_datalen]
can't be changed from userland.  Make them read-only and provide
descriptions.

kern.ipc.max_datalen must never be less than one byte.  Enforce this
with a panic in net_init_domain().

Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-02-18 17:16:18 +00:00
Andre Oppermann
ec63cb90a3 Replace the 4k fixed sized jumbo mbuf clusters with PAGE_SIZE sized
jumbo mbuf clusters.  To make the variable size clear they are named
MJUMPAGESIZE.

Having jumbo clusters with the native PAGE_SIZE is more useful than
a fixed 4k size according the device driver writers using this API.

The 9k and 16k jumbo mbuf clusters remain unchanged.

Requested by:	glebius, gallatin
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-02-17 14:14:15 +00:00
Andre Oppermann
a4684d742d Make sysctl_msec_to_ticks(SYSCTL_HANDLER_ARGS) generally available instead
of being private to tcp_timer.c.

Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-02-16 15:40:36 +00:00
David Xu
94f0972bec Fix a long standing race between sleep queue and thread
suspension code. When a thread A is going to sleep, it calls
sleepq_catch_signals() to detect any pending signals or thread
suspension request, if nothing happens, it returns without
holding process lock or scheduler lock, this opens a race
window which allows thread B to come in and do process
suspension work, however since A is still at running state,
thread B can do nothing to A, thread A continues, and puts
itself into actually sleeping state, but B has never seen it,
and it sits there forever until B is woken up by other threads
sometimes later(this can be very long delay or never
happen). Fix this bug by forcing sleepq_catch_signals to
return with scheduler lock held.
Fix sleepq_abort() by passing it an interrupted code, previously,
it worked as wakeup_one(), and the interruption can not be
identified correctly by sleep queue code when the sleeping
thread is resumed.
Let thread_suspend_check() returns EINTR or ERESTART, so sleep
queue no longer has to use SIGSTOP as a hack to build a return
value.

Reviewed by:	jhb
MFC after:	1 week
2006-02-15 23:52:01 +00:00
Wayne Salamon
085a0d43ca Audit the arguments to the ptrace(2) system call.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-14 01:18:31 +00:00
Wayne Salamon
bfd7575a39 Audit the arguments to the kill(2) and killpg(2) system calls.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-14 01:17:03 +00:00
David Xu
d8267df729 In order to speed up process suspension on MP machine, send IPI to
remote CPU. While here, abstract thread suspension code into a function
called sig_suspend_threads, the function is called when a process received
a STOP signal.
2006-02-13 03:16:55 +00:00
Robert Watson
13f322c2fc Improve consistency of return() style.
MFC after:	3 days
2006-02-12 15:00:27 +00:00
Poul-Henning Kamp
e8444a7e6f CPU time accounting speedup (step 2)
Keep accounting time (in per-cpu) cputicks and the statistics counts
in the thread and summarize into struct proc when at context switch.

Don't reach across CPUs in calcru().

Add code to calibrate the top speed of cpu_tickrate() for variable
cpu_tick hardware (like TSC on power managed machines).

Don't enforce monotonicity (at least for now) in calcru.  While the
calibrated cpu_tickrate ramps up it may not be true.

Use 27MHz counter on i386/Geode.

Use TSC on amd64 & i386 if present.

Use tick counter on sparc64
2006-02-11 09:33:07 +00:00
David Xu
42925630b6 Test before modifying p_sflag to avoid unconditionally cache line
ping-pong on SMP.
2006-02-10 14:59:16 +00:00
David Xu
71b7afb2b4 Call thread_stopped in thr_exit to notify parent that the child process
is now fully stopped, this was already in kse_exit().
2006-02-10 03:34:29 +00:00
Poul-Henning Kamp
eb2da9a51f Simplify system time accounting for profiling.
Rename struct thread's td_sticks to td_pticks, we will need the
other name for more appropriately named use shortly.  Reduce it
from uint64_t to u_int.

Clear td_pticks whenever we enter the kernel instead of recording
its value as reference for userret().  Use the absolute value of
td->pticks in userret() and eliminate third argument.
2006-02-08 08:09:17 +00:00
Poul-Henning Kamp
5b1a8eb397 Modify the way we account for CPU time spent (step 1)
Keep track of time spent by the cpu in various contexts in units of
"cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t
only when somebody wants to inspect the numbers.

For now "cputicks" are still derived from the current timecounter
and therefore things should by definition remain sensible also on
SMP machines.  (The main reason for this first milestone commit is
to verify that hypothesis.)

On slower machines, the avoided multiplications to normalize timestams
at every context switch, comes out as a 5-7% better score on the
unixbench/context1 microbenchmark.  On more modern hardware no change
in performance is seen.
2006-02-07 21:22:02 +00:00
John Baldwin
222fdf4bff Provide some anti-footshooting. Don't allow the user to set the interval
for acctwatch() runs to be negative or zero as this could result in either
a possible hang (or panic if INVARIANTS is on).  Previously the accounting
code handled the <= 0 case by calling acctwatch on every clock tick (eww!)
due to an implementation detail of callout_reset().  (Tick counts of
<= 0 are converted to 1).

MFC after:	3 days
2006-02-07 18:59:47 +00:00
John Baldwin
505a14934e - Add a kthread to periodically call acctwatch() when accounting is active
instead of calling acctwatch() from softclock.  The acctwatch() function
  needs to hold an sx lock and also makes a VFS call, and neither of these
  are good things (or safe) to do from a callout.  The kthread only exists
  and is running when accounting is turned on; it is started and stopped
  as needed.  I didn't run acctwatch() via the thread taskqueue at Robert's
  request as he was worried that if the accounting file was over NFS the
  VFS_STAT() calls might stall other work on the taskqueue.
- Add an acct_disable() function to take care of closing the accounting
  vnode and cleaning up so we don't duplicate the same code in two
  different places.

MFC after:	3 days
2006-02-07 16:04:03 +00:00
John Baldwin
8917b8d28c - Always call exec_free_args() in kern_execve() instead of doing it in all
the callers if the exec either succeeds or fails early.
- Move the code to call exit1() if the exec fails after the vmspace is
  gone to the bottom of kern_execve() to cut down on some code duplication.
2006-02-06 22:06:54 +00:00
John Baldwin
809f984b21 Add a kern_eaccess() function and use it to implement xenix_eaccess()
rather than kern_access().

Suggested by:	rwatson
2006-02-06 22:00:53 +00:00
John Baldwin
934ba9b2cf - Move the wakeup() for exiting kthreads out of exit1() and into
kthread_exit() as that is cleaner and less obscured.  It also does the
  wakeup sooner.
- Add some comments to kthread_exit().
2006-02-06 21:56:13 +00:00
John Baldwin
2c9d9d392a We don't need the proc lock to check P_KTHREAD on curthread since it is
only set before the kthread starts executing and is never cleared.
2006-02-06 21:54:47 +00:00
Olivier Houchard
2a3b10658d rwlock expects the struct thread to be aligned on 8 bytes, so make sure
thread0 is.
2006-02-06 16:03:10 +00:00
Jeff Roberson
04f6d3effa - Add a ref count to the mount structure. Sleep for up to 3 seconds in
vfs_mount_destroy waiting for this ref to hit 0.  We don't print an
   error if we are rebooting as the root mount always retains some refernces
   by init proc.
 - Acquire a mnt ref for every vnode allocated to a mount point.  Drop this
   ref only once vdestroy() has been called and the mount has been freed.
 - No longer NULL the v_mount pointer in delmntque() so that we may release
   the ref after vgone() has been called.  This allows us to guarantee
   that the mount point structure will be valid until the last vnode has
   lost its last ref.
 - Fix a few places that rely on checking v_mount to detect recycling.

Sponsored by:	Isilon Systems, Inc.
MFC After:	1 week
2006-02-06 10:19:50 +00:00
Jeff Roberson
2f0bca553a - Don't check v_mount for NULL to determine if a vnode has been recycled.
Use the more appropriate VI_DOOMED flag instead.

Sponsored by:	Isilon Systems, Inc.
MFC After:	1 week
2006-02-06 10:15:27 +00:00
Jeff Roberson
36a52c3cae - Add the global 'rebooting' variable that is used to detect when
boot() has been called.

Sponsored by:	Isilon Systems, Inc.
MFC After:	1 week
2006-02-06 10:12:00 +00:00
David Xu
ea8e65b0fa Add members pl_sigmask and pl_siglist into ptrace_lwpinfo to get lwp's
signal mask and pending signals.
2006-02-06 09:41:56 +00:00
Robert Watson
9653775b18 Regenerate. 2006-02-06 02:00:32 +00:00
Robert Watson
c983324ef5 Prefer AUE_FOO audit identifiers to AUE_O_FOO, which are largely left
over from the Darwin implementation.

When we implement a system call as a wrapper to sysctl(), audit it as
AUE_SYSCTL.  This leads to greater compatibility with Solaris audit
trails as sysctl() argument tokens are not the same as the ones for
the originaly system calls (i.e., setdomainname()).

Replace references to AUE_ events that are equivilent to AUE_NULL with
AUE_NULL.  In the case of process signal configuration, this is
because these events do not require auditing.

Move from the Darwin spelling of getsockopt() to the FreeBSD/Solaris
one.

Audit nmount().

Obtained from:	TrustedBSD Project
2006-02-06 02:00:06 +00:00
Robert Watson
89964dd284 When exiting a thread, submit any pending record. Today, we don't
audit thread exit, but should that happen, this will prevent
unhappiness, as the thread exit system call will never return, and
hence not commit the record.

Pointed out by/with:	cognet
Obtained from:		TrustedBSD Project
2006-02-06 01:51:08 +00:00
Wayne Salamon
2f8a46d5ff Audit the arguments (user/group IDs) for the system calls that set these IDs.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-06 00:32:33 +00:00
Wayne Salamon
ad20c8f325 Audit the args to rfork(), and the child PID for all fork system calls.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-06 00:28:50 +00:00
Wayne Salamon
de3007e8f3 Audit the pid being requested in wait4().
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-06 00:19:09 +00:00
Wayne Salamon
a750d0b2a2 Add auditing of arguments to the close() and fstat() system calls. Much more
argument auditing yet to come, for remaining system calls in this file.

Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-05 23:57:32 +00:00
Robert Watson
00c28d9678 On process exit, audit the return value of the process, and commit the
record immediately, as this system call never returns.

Obtained from:	TrustedBSD Project
2006-02-05 21:08:25 +00:00
Robert Watson
6e8525ce84 When GC'ing a thread, assert that it has no active audit record.
This should not happen, but with this assert, brueffer and I would
not have spent 45 minutes trying to figure out why he wasn't
seeing audit records with the audit version in CVS.

Obtained from:	TrustedBSD Project
2006-02-05 21:06:09 +00:00
Robert Watson
95fea57c65 Add AUDITVNODE[12] flags to namei(), which cause namei() to audit path
and vnode attribute information for looked up vnodes during the lookup
operation.  This will allow consumers of namei() to specify that this
information be added to the in-process audit record.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2006-02-05 15:42:01 +00:00
David Xu
25c926f1b0 Regenerate. 2006-02-05 02:23:41 +00:00
David Xu
9e7d72246f Implement thr_set_name to set a name for thread.
Reviewed by: julian
2006-02-05 02:18:46 +00:00
David Xu
7f96995ebd Create childproc_jobstate function to report job control state, this
also fixes a bug in childproc_continued which ignored PS_NOCLDSTOP.
2006-02-04 14:10:57 +00:00
David Xu
a99f7ca21e Axe unused code. 2006-02-04 06:36:39 +00:00
John Baldwin
37f84a6018 Add a comment. 2006-02-03 21:09:40 +00:00
John Baldwin
b0864d13ab Sort includes. 2006-02-03 16:37:55 +00:00
Robert Watson
59428b0bad In fchdir(), Giant must be separately acquired and dropped if the old
vnode is from a file system that is not MPSAFE, as vrele() expects
Giant to be held when it is called on a non-MPSAFE vnode.

Spotted by:	kris
Tested by:	glebius
2006-02-03 15:42:16 +00:00
Robert Watson
d7bd3313e2 Regenerate. 2006-02-03 11:51:19 +00:00
Robert Watson
62646c07f6 Assign audit event identifiers to many system calls.
Much work by:	wsalamon
Obtained from:	TrustedBSD Project
2006-02-03 11:48:37 +00:00
Tor Egge
c78226329a For low memory situations, non-VMIO buffers didnt't release pages back to
the system when brelse() was called with B_RELBUF set on the buffer.  This
could be a problem when the system was low on memory, had many buffers on
QUEUE_EMPTYKVA and started to traverse directories.  For each getnewbuf(),
pages were allocated from the system, driving the free reserve downwards.
For each brelse(), the system put the buffer on QUEUE_CLEAN, with B_INVAL
set.

This commit changes the semantics of B_RELBUF to also free pages from
non-VMIO buffers.

Reviewed by:	alc
2006-02-02 21:37:39 +00:00
Olivier Houchard
56db7f4cc6 Don't destroy the slave /dev entry until someone figures out why devfs seems
to behave badly when we do so.
2006-02-02 20:35:45 +00:00
John Baldwin
f6b457923d Whitespace fix.
Submitted by:	Wojciech A. Koszek <dunstan at zsno ids czest pl>
2006-02-02 20:14:52 +00:00
Jeff Roberson
68ce4375c4 - textvp may have been from a different mountpoint than ndp->ni_vp and
we may need to acquire giant to vrele it.

Found by:	mjacob
MFC After:	3 days
2006-02-02 08:39:39 +00:00
Robert Watson
06f2859f6d Regenerate. 2006-02-02 01:45:01 +00:00
Robert Watson
35d29f5091 Map audit-related system calls to audit event identifiers.
Much work by:	wsalamon
Obtained from:	TrustedBSD Project
2006-02-02 01:44:30 +00:00
Robert Watson
fcf7f27a36 Hook up audit to fork() and exit() events. These changes manage the
audit state on processes, not auditing of these events.

Much work by:	wsalamon
Obtained from:	TrustedBSD Project
2006-02-02 01:32:58 +00:00
Robert Watson
3683665bbd Hook up audit to the initial process creation events (proc0, proc1).
Much help from:	wsalamon
Obtained from:	TrustedBSD Project
2006-02-02 01:16:31 +00:00
Robert Watson
911b84b08d Add new fields to process-related data structures:
- td_ar to struct thread, which holds the in-progress audit record during
  a system call.

- p_au to struct proc, which holds per-process audit state, such as the
  audit identifier, audit terminal, and process audit masks.

In the earlier implementation, td_ar was added to the zero'd section of
struct thread.  In order to facilitate merging to RELENG_6, it has been
moved to the end of the data structure, requiring explicit
initalization in the thread constructor.

Much help from:	wsalamon
Obtained from:	TrustedBSD Project
2006-02-02 00:37:05 +00:00