Commit Graph

9201 Commits

Author SHA1 Message Date
Robert Watson
4d4b555efa Modify UNIX domain sockets to guarantee, and assume, that so_pcb is always
defined for an in-use socket.  This allows us to eliminate countless tests
of whether so_pcb is non-NULL, eliminating dozens of error cases.  For
now, retain the call to sotryfree() in the uipc_abort() path, but this
will eventually move to soabort().

These new assumptions should be largely correct, and will become more so
as the socket/pcb reference model is fixed.  Removing the notion that
so_pcb can be non-NULL is a critical step towards further fine-graining
of the UNIX domain socket locking, as the so_pcb reference no longer
needs to be protected using locks, instead it is a property of the socket
life cycle.
2006-03-17 13:52:57 +00:00
Alan Cox
41634e2e8d Correct two vm object reference leaks in error cases.
Submitted by: davidxu
2006-03-16 08:51:59 +00:00
Robert Watson
92c07a345e Change soabort() from returning int to returning void, since all
consumers ignore the return value, soabort() is required to succeed,
and protocols produce errors here to report multiple freeing of the
pcb, which we hope to eliminate.
2006-03-16 07:03:14 +00:00
David Xu
795a11d049 Fix a race between file operations and rfork(RFCFDG) by parking
all other threads at user boundary, the race can crash kernel
under stress testing.

Reviewed by: jhb
MFC after: 3 days
2006-03-15 23:24:14 +00:00
Sam Leffler
47e2996e8b promote fast ipsec's m_clone routine for public use; it is renamed
m_unshare and the caller can now control how mbufs are allocated

Reviewed by:	andre, luigi, mlaier
MFC after:	1 week
2006-03-15 21:11:11 +00:00
Poul-Henning Kamp
590487078f Disable the "cputick increased..." message now that the dust has settled. 2006-03-15 20:22:32 +00:00
Alexander Leidinger
a8f47039c7 Fix memory leak introduced in previous revision.
Discussed with:	phk
2006-03-15 19:23:08 +00:00
Robert Watson
93709ad0be As with socket consumer references (so_count), make sofree() return
without GC'ing the socket if a strong protocol reference to the socket
is present (SS_PROTOREF).
2006-03-15 12:45:35 +00:00
David Xu
e170bfda56 1. Count last time slice, this intends to fix
"calcru: runtime went backwards" bug for threaded process.
2. Add comment about possible logical problem with scheduler.

MFC after: 3 days
2006-03-14 04:00:21 +00:00
John-Mark Gurney
45e0d0aa30 spell pdata correctly, we now will only dump maxlen of each mbuf in the
chain, instead of the entire mbuf...  This should probably be reworked
so that it prints at max maxlen bytes for the entire chain...
2006-03-14 00:22:10 +00:00
Ruslan Ermilov
936ddefcd6 The mount(8) manpage says: "In case of conflicting options being
specified, the rightmost option takes effect."  Fix code to obey
this.  This makes e.g. "mount -r /usr" or "mount -ar" actually
mount file systems read-only.
2006-03-13 14:58:37 +00:00
David Xu
28e989e9ca Remove unused code. 2006-03-13 10:37:25 +00:00
Christian S.J. Peron
a19fd0e766 Make sure that we are adding a path token to the audit record in open(2).
Do this by making sure we are using the AUDITVNODE1 mask in the namei flags.

Obtained from:	TrustedBSD Project
2006-03-11 17:14:05 +00:00
Poul-Henning Kamp
272601f8f0 Go over calcru and friends once more.
Reintroduce the monotonicity for the normal case and make the two
special cases behave in what is belived to be the most sensible fasion.
2006-03-11 10:48:19 +00:00
Tor Egge
ca2fa80767 Block secondary writes while expunging active unlinked files.
Fix detection of active unlinked files by checking VI_OWEINACT and
VI_DOINGINACT in addition to v_usecount.

Defer inactive handling for unlinked files if the file system is mostly
suspended (secondary writes being blocked).

Perform deferred inactive handling after the file system is resumed.
2006-03-11 01:08:37 +00:00
Jung-uk Kim
0d84d9ebb5 Implement printf 'X' conversion for both libstand and kernel. 2006-03-09 22:37:34 +00:00
Poul-Henning Kamp
fef527ee73 Oops, forgot newline. 2006-03-09 09:44:10 +00:00
Poul-Henning Kamp
0f038c05ea Add slop to "backwards" cpu accounting messages, 3 usec or 1% whichever
triggers.

This should eliminate all the trivial messages which result from minor
increases in cpu_tick frequency.

Machines which don't du cpu clock fiddling shouldn't issue "backwards"
messages now.

Laptops and other machines where the initial estimate of cputicks may be
waaaay off will still issue warnings.
2006-03-09 09:33:17 +00:00
Poul-Henning Kamp
6cda760f09 silence cpu_tick calibration and notice only (under bootverbose)
when the frequency increases.
2006-03-09 09:30:33 +00:00
Poul-Henning Kamp
c8d7706e75 Ignore kenv strings which overflow the room we have, rather than pretend
we have room for them.
2006-03-09 09:29:41 +00:00
David Xu
7b8d5e4865 Remove _STOPEVENT call, it is already called in issignal, simplify
code for SIGKILL signal.
2006-03-09 08:31:51 +00:00
Tor Egge
791dd2fade Use vn_start_secondary_write() and vn_finished_secondary_write() as a
replacement for vn_write_suspend_wait() to better account for secondary write
processing.

Close race where secondary writes could be started after ffs_sync() returned
but before the file system was marked as suspended.

Detect if secondary writes or softdep processing occurred during vnode sync
loop in ffs_sync() and retry the loop if needed.
2006-03-08 23:43:39 +00:00
Stephan Uphoff
68ff3c2445 Fix exec_map resource leaks.
Tested by: kris@
2006-03-08 20:21:54 +00:00
Andre Oppermann
a7bd90ef93 Properly handle the case when the packet secondary zone can't allocate
further mbuf clusters to attach to mbufs.

Reported by:	kris
Tested by:	kris
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-03-08 14:05:38 +00:00
John Baldwin
88ca07e79a Style nit. 2006-03-07 22:17:26 +00:00
John Baldwin
67c0796ca3 For consistency sake, use >= MINCLSIZE rather than > MINCLSIZE to determine
whether or not to allocate a full mbuf cluster rather than just a plain
mbuf when adding on additional mbufs in m_getm().  In practice, there wasn't
any resulting mem trashing since m_getm() doesn't ever allocate an mbuf with
a packet header, and MINCLSIZE is the available payload in an mbuf with a
header rather than the available payload in a plain mbuf.

Discussed with: 	andre (lightly)
2006-03-07 21:31:20 +00:00
Poul-Henning Kamp
fccfcfba00 Add missing cast. 2006-03-04 06:07:26 +00:00
Poul-Henning Kamp
5b51d1de62 More detailed logging if timestepwarnings are enabled. 2006-03-04 06:06:43 +00:00
Paul Saab
6308f39da8 use strlcpy in cvtstatfs and copy_statfs instead of bcopy to ensure
the copied strings are properly terminated.

bzero the statfs32 struct in copy_statfs.
2006-03-04 00:09:09 +00:00
Paul Saab
45d48bdad5 Fix bug in malloc_uninit():
Releasing items from the mt_zone can not be done by a simple
uma_zfree() call since mt_zone is allocated with the UMA_ZONE_MALLOC
flag. Use uma_zfree_arg instead and supply the slab.

This bug caused panics in low memory situations on unloading kernel
modules containing MALLOC_DEFINE(..) statements.

Submitted by:	ups
2006-03-03 22:36:52 +00:00
Paul Saab
6815739e00 Don't truncate f_mntfromname & f_mntonname to 16 characters when
translating statfs into ostatfs.  This allows 4.x binaries making
statfs calls to work on 6.x.
2006-03-03 07:20:54 +00:00
Marcus Alves Grando
b4130b8ae0 - Print message about cpufreq and timecounter TSC
Approved by:	njl
MFC after:	1 day
2006-03-03 02:06:04 +00:00
Tor Egge
3b582b4e72 Eliminate a deadlock when creating snapshots. Blocking vn_start_write() must
be called without any vnode locks held.  Remove calls to vn_start_write() and
vn_finished_write() in vnode_pager_putpages() and add these calls before the
vnode lock is obtained to most of the callers that don't already have them.
2006-03-02 22:13:28 +00:00
Tor Egge
b983aac762 Don't try to show marker nodes. 2006-03-02 21:31:15 +00:00
David Xu
3dfcaad667 Add signal set sq_kill to sigqueue structure, the member saves all
signals sent by kill() syscall, without this, a signal sent by
sigqueue() can cause a signal sent by kill() to be lost.
2006-03-02 14:06:40 +00:00
Poul-Henning Kamp
301af28a06 Suffer a little bit of math every 16 second and tighten calibration of
cpu_ticks to the low side of PPM.
2006-03-02 08:09:46 +00:00
Jeff Roberson
eb2ea10590 - Move softdep from using a global worklist to per-mount worklists. This
has many positive effects including improved smp locking, reducing
   interdependencies between mounts that can lead to deadlocks, etc.
 - Add the softdep worklist and various counters to the ufsmnt structure.
 - Add a mount pointer to the workitem and remove mount pointers from the
   various structures derived from the workitem as they are now redundant.
 - Remove the poor-man's semaphore protecting softdep_process_worklist and
   softdep_flushworklist.  Several threads may now process the list
   simultaneously.
 - Add softdep_waitidle() to block the thread until all pending
   dependencies being operated on by other threads have been flushed.
 - Use softdep_waitidle() in unmount and snapshots to block either
   operation until the fs is stable.
 - Remove softdep worklist processing from the syncer and move it into the
   softdep_flush() thread.  This thread processes all softdep mounts
   once each second and when it is called via the new softdep_speedup()
   when there is a resource shortage.  This removes the softdep hook
   from the kernel and various hacks in header files to support it.

Reviewed by/Discussed with:	tegge, truckman, mckusick
Tested by:	kris
2006-03-02 05:50:23 +00:00
David Xu
80452384e6 Regenerate. 2006-03-01 06:49:38 +00:00
David Xu
61d3a4efc2 Let kernel POSIX timer code and mqueue code to use integer as a resource
handle, the timer_t and mqd_t types will be a pointer which userland
will define it.
2006-03-01 06:29:34 +00:00
Paul Saab
fa545f434c Fix 32bit sendfile by implementing kern_sendfile so that it takes
the header and trailers as iovec arguments instead of copying them
in inside of sendfile.

Reviewed by:	jhb
MFC after:	3 weeks
2006-02-28 19:39:18 +00:00
Gleb Smirnoff
73bb09f2d0 One more grammar nit.
Submitted by:	ru
2006-02-27 07:22:32 +00:00
David Xu
27b8220d12 1. Remove aio entry from lists earlier in aio_free_entry,
so other threads can not see it if we unlock the proc
   lock (this can happen in knlist_delete).  Don't do wakeup,
   it is not necessary.

2. Decrease kaio_buffer_count in biohelper rather than
   doing it in aio_bio_done_notify.

3. In aio_bio_done_notify, don't send notification if KAIO_RUNDOWN
   was set, because the process is already in single thread mode.

4. Use assignment to initialize aiothreadflags.

5. AIOCBLIST_RUNDOWN is not useful, axe the code using it.

6. use LIO_NOP instead of zero.
2006-02-26 12:56:23 +00:00
Gleb Smirnoff
fcf9061858 Fix several typos and trim spaces at eol.
PR:		kern/93759
Submitted by:	Antoine Brodin <antoine.brodin laposte.net>
2006-02-26 11:44:28 +00:00
Scott Long
6ec6fb9bc6 Always print a newline char at the end of the line. 2006-02-25 16:20:22 +00:00
John Baldwin
b36f458861 Use the recently added msleep_spin() function to simplify the
callout_drain() logic.  We no longer need a separate non-spin mutex to
do sleep/wakeup with, instead we can now just use the one spin mutex to
manage all the callout functionality.
2006-02-23 19:13:12 +00:00
David Xu
7e0221a251 1. Refine kern_sigtimedwait() to remove redundant code.
2. Fix a bug, if thread got a SIGKILL signal, call sigexit() to kill
   its process.

MFC after: 3 days
2006-02-23 09:24:19 +00:00
David Xu
7c9a98f15b Code cleanup, simply compare with curproc. 2006-02-23 05:50:55 +00:00
Jeff Roberson
8febcfb92f - Use vfs_ref/rel to protect a mountpoint from going away while VFS_STATFS
is being called.  Be sure to grab the ref before we unlock the vnode to
   prevent the mount from disappearing.

Tested by:	kris
2006-02-23 05:18:07 +00:00
Jeff Roberson
a1db11fc40 - Release the mount ref once the vnode has been recycled rather than once
the last reference is dropped.  I forgot that vnodes can stick around
   for a very long time until processes discover that they are dead.  This
   means that a vnode reference is not sufficient to keep the mount
   referenced and even more code will be required to ref mount points.

Discovered by:	kris
2006-02-23 05:15:37 +00:00
David Xu
dc94f5e383 Move comments to more accurate place. 2006-02-23 03:42:17 +00:00
David Xu
c008d51784 Fix a sleep queue race for KSE thread.
Reviewed by: jhb
2006-02-23 00:13:58 +00:00
John Baldwin
daad1cd74d Fixup some comments. Mutexes's are locked, not entered for several years
now and msleep blocks threads rather than processes.
2006-02-22 20:46:10 +00:00
John Baldwin
06ad42b2f7 Close some races between procfs/ptrace and exit(2):
- Reorder the events in exit(2) slightly so that we trigger the S_EXIT
  stop event earlier.  After we have signalled that, we set P_WEXIT and
  then wait for any processes with a hold on the vmspace via PHOLD to
  release it.  PHOLD now KASSERT()'s that P_WEXIT is clear when it is
  invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops
  to zero.
- Change proc_rwmem() to require that the processing read from has its
  vmspace held via PHOLD by the caller and get rid of all the junk to
  screw around with the vmspace reference count as we no longer need it.
- In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it
  doesn't exist.
- Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers
  FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem()
  to clear an earlier single-step simualted via a breakpoint).  We only
  do one to avoid races.  Also, by making the EINVAL error for unknown
  requests be part of the default: case in the switch, the various
  switch cases can now just break out to return which removes a _lot_ of
  duplicated PRELE and proc unlocks, etc.  Also, it fixes at least one bug
  where a LWP ptrace command could return EINVAL with the proc lock still
  held.
- Changed the locking for ptrace_single_step(), ptrace_set_pc(), and
  ptrace_clear_single_step() to always be called with the proc lock
  held (it was a mixed bag previously).  Alpha and arm have to drop
  the lock while the mess around with breakpoints, but other archs
  avoid extra lock release/acquires in ptrace().  I did have to fix a
  couple of other consumers in kern_kse and a few other places to
  hold the proc lock and PHOLD.

Tested by:	ps (1 mostly, but some bits of 2-4 as well)
MFC after:	1 week
2006-02-22 18:57:50 +00:00
John Baldwin
54690b5679 Don't do a PHOLD() in kthread_create() w/o a matching PRELE() in
kthread_exit().  Rather than add the missing PRELE() I chose to just
axe the PHOLD() since it was redundant with the P_SYSTEM flag.

MFC after:	1 week
2006-02-22 17:21:45 +00:00
John Baldwin
8f95fc2481 Various style and comment fixes.
Submitted by:	bde
2006-02-22 16:58:48 +00:00
Wayne Salamon
bc5504b942 Add pathname and/or vnode argument auditing for the following system calls:
quotactl, statfs, fstatfs, fchdir, chdir, chroot, open, mknod, mkfifo,
link, symlink, undelete, unlink, access, eaccess, stat, lstat, pathconf,
readlink, chflags, lchflags, fchflags, chmod, lchmod, fchmod, chown,
lchown, fchown, utimes, lutimes, futimes, truncate, ftruncate, fsync,
rename, mkdir, rmdir, getdirentries, revoke, lgetfh, getfh, extattrctl,
extattr_set_file, extattr_set_link, extattr_get_file, extattr_get_link,
extattr_delete_file, extattr_delete_link, extattr_list_file, extattr_list_link.

In many cases the pathname and vnode auditing is done within namei lookup
instead of directly in the system call.

Audit the remaining arguments to these system calls:
fstatfs, fchdir, open, mknod, chflags, lchflags, fchflags, chmod, lchmod,
fchmod, chown, lchown, fchown, futimes, ftruncate, fsync, mkdir,
getdirentries.
2006-02-22 16:04:20 +00:00
Jeff Roberson
c5dcb84008 - Revert r1.406 until a solution can be found that doesn't break nfs. The
statfs handler in nfs will lock vnodes which may lead to deadlock or
   recursion.

Found by:	kris
Pointy hat to:	me
2006-02-22 09:52:25 +00:00
Jeff Roberson
a4aeaefe5a - We can not hold a vnode lock while we do a lookup. Search for and load
modules prior to looking up the directory which we will cover to avoid
   this problem in mount.
 - We must hold the coveredvp locked before we can busy the mountpoint to
   prevent a lock order reversal with the vfs_busy() in lookup which holds
   the directory lock prior to doing a vfs_busy().  The directory lock is
   required to safely clear the v_mountedhere field on the directory.

MFC After:	1 week
2006-02-22 06:29:55 +00:00
Jeff Roberson
8a7cd2fdfb - Grab a mnt ref in vfs_busy() before dropping the interlock. This will
prevent the mount point from going away while we're waiting on the lock.
   The ref does not need to persist once we have the lock because the
   lock prevents the mount point from being unmounted.

MFC After:	1 week
2006-02-22 06:20:12 +00:00
Jeff Roberson
05b6a20a66 - Hold the vnode used in the statfs related functions until we're done with
the VFS_STATFS call to prevent the mount from disappearing while we're
   stating.
 - Convert these routines to use MPSAFE namei semantics.

MFC After:	1 week
2006-02-22 06:19:08 +00:00
David Xu
ba0360b135 Abstract function mqfs_create_node() to create a mqueue node. 2006-02-22 02:38:25 +00:00
David Xu
ad8de0f243 If block size is zero, use normal file operations to do I/O,
this eliminates a divided-by-zero fault.

Recommended by: phk
2006-02-22 00:05:12 +00:00
John Baldwin
bd106be404 Move the ruadd() in kern_exit() to save our final stats in our child
stats even further down in exit1() so that it includes the runtime and
tick counts from the final time slice for the dying thread.

Reviewed by:	phk
2006-02-21 21:48:42 +00:00
John Baldwin
6fc6433ecd Split calcru() back into a calcru1() function shared with calccru() and
a calcru() wrapper that passes a local rusage_ext on the stack that is
a snapshot to do the calculations on.  Now we can pass p->p_crux to
calcru1() in calccru() again which fixes the issues with runtime going
backwards messages when dead processes are harvested by init.

Reviewed by:	phk
Tested by:	Stefan Ehmann shoesoft at gmx dot net
2006-02-21 21:47:46 +00:00
Andre Oppermann
80444f8803 The sysctls kern.ipc.[max_linkhdr|max_protohdr|max_hdr|max_datalen]
can't be changed from userland.  Make them read-only and provide
descriptions.

kern.ipc.max_datalen must never be less than one byte.  Enforce this
with a panic in net_init_domain().

Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-02-18 17:16:18 +00:00
Andre Oppermann
ec63cb90a3 Replace the 4k fixed sized jumbo mbuf clusters with PAGE_SIZE sized
jumbo mbuf clusters.  To make the variable size clear they are named
MJUMPAGESIZE.

Having jumbo clusters with the native PAGE_SIZE is more useful than
a fixed 4k size according the device driver writers using this API.

The 9k and 16k jumbo mbuf clusters remain unchanged.

Requested by:	glebius, gallatin
Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-02-17 14:14:15 +00:00
Andre Oppermann
a4684d742d Make sysctl_msec_to_ticks(SYSCTL_HANDLER_ARGS) generally available instead
of being private to tcp_timer.c.

Sponsored by:	TCP/IP Optimization Fundraise 2005
MFC after:	3 days
2006-02-16 15:40:36 +00:00
David Xu
94f0972bec Fix a long standing race between sleep queue and thread
suspension code. When a thread A is going to sleep, it calls
sleepq_catch_signals() to detect any pending signals or thread
suspension request, if nothing happens, it returns without
holding process lock or scheduler lock, this opens a race
window which allows thread B to come in and do process
suspension work, however since A is still at running state,
thread B can do nothing to A, thread A continues, and puts
itself into actually sleeping state, but B has never seen it,
and it sits there forever until B is woken up by other threads
sometimes later(this can be very long delay or never
happen). Fix this bug by forcing sleepq_catch_signals to
return with scheduler lock held.
Fix sleepq_abort() by passing it an interrupted code, previously,
it worked as wakeup_one(), and the interruption can not be
identified correctly by sleep queue code when the sleeping
thread is resumed.
Let thread_suspend_check() returns EINTR or ERESTART, so sleep
queue no longer has to use SIGSTOP as a hack to build a return
value.

Reviewed by:	jhb
MFC after:	1 week
2006-02-15 23:52:01 +00:00
Wayne Salamon
085a0d43ca Audit the arguments to the ptrace(2) system call.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-14 01:18:31 +00:00
Wayne Salamon
bfd7575a39 Audit the arguments to the kill(2) and killpg(2) system calls.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-14 01:17:03 +00:00
David Xu
d8267df729 In order to speed up process suspension on MP machine, send IPI to
remote CPU. While here, abstract thread suspension code into a function
called sig_suspend_threads, the function is called when a process received
a STOP signal.
2006-02-13 03:16:55 +00:00
Robert Watson
13f322c2fc Improve consistency of return() style.
MFC after:	3 days
2006-02-12 15:00:27 +00:00
Poul-Henning Kamp
e8444a7e6f CPU time accounting speedup (step 2)
Keep accounting time (in per-cpu) cputicks and the statistics counts
in the thread and summarize into struct proc when at context switch.

Don't reach across CPUs in calcru().

Add code to calibrate the top speed of cpu_tickrate() for variable
cpu_tick hardware (like TSC on power managed machines).

Don't enforce monotonicity (at least for now) in calcru.  While the
calibrated cpu_tickrate ramps up it may not be true.

Use 27MHz counter on i386/Geode.

Use TSC on amd64 & i386 if present.

Use tick counter on sparc64
2006-02-11 09:33:07 +00:00
David Xu
42925630b6 Test before modifying p_sflag to avoid unconditionally cache line
ping-pong on SMP.
2006-02-10 14:59:16 +00:00
David Xu
71b7afb2b4 Call thread_stopped in thr_exit to notify parent that the child process
is now fully stopped, this was already in kse_exit().
2006-02-10 03:34:29 +00:00
Poul-Henning Kamp
eb2da9a51f Simplify system time accounting for profiling.
Rename struct thread's td_sticks to td_pticks, we will need the
other name for more appropriately named use shortly.  Reduce it
from uint64_t to u_int.

Clear td_pticks whenever we enter the kernel instead of recording
its value as reference for userret().  Use the absolute value of
td->pticks in userret() and eliminate third argument.
2006-02-08 08:09:17 +00:00
Poul-Henning Kamp
5b1a8eb397 Modify the way we account for CPU time spent (step 1)
Keep track of time spent by the cpu in various contexts in units of
"cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t
only when somebody wants to inspect the numbers.

For now "cputicks" are still derived from the current timecounter
and therefore things should by definition remain sensible also on
SMP machines.  (The main reason for this first milestone commit is
to verify that hypothesis.)

On slower machines, the avoided multiplications to normalize timestams
at every context switch, comes out as a 5-7% better score on the
unixbench/context1 microbenchmark.  On more modern hardware no change
in performance is seen.
2006-02-07 21:22:02 +00:00
John Baldwin
222fdf4bff Provide some anti-footshooting. Don't allow the user to set the interval
for acctwatch() runs to be negative or zero as this could result in either
a possible hang (or panic if INVARIANTS is on).  Previously the accounting
code handled the <= 0 case by calling acctwatch on every clock tick (eww!)
due to an implementation detail of callout_reset().  (Tick counts of
<= 0 are converted to 1).

MFC after:	3 days
2006-02-07 18:59:47 +00:00
John Baldwin
505a14934e - Add a kthread to periodically call acctwatch() when accounting is active
instead of calling acctwatch() from softclock.  The acctwatch() function
  needs to hold an sx lock and also makes a VFS call, and neither of these
  are good things (or safe) to do from a callout.  The kthread only exists
  and is running when accounting is turned on; it is started and stopped
  as needed.  I didn't run acctwatch() via the thread taskqueue at Robert's
  request as he was worried that if the accounting file was over NFS the
  VFS_STAT() calls might stall other work on the taskqueue.
- Add an acct_disable() function to take care of closing the accounting
  vnode and cleaning up so we don't duplicate the same code in two
  different places.

MFC after:	3 days
2006-02-07 16:04:03 +00:00
John Baldwin
8917b8d28c - Always call exec_free_args() in kern_execve() instead of doing it in all
the callers if the exec either succeeds or fails early.
- Move the code to call exit1() if the exec fails after the vmspace is
  gone to the bottom of kern_execve() to cut down on some code duplication.
2006-02-06 22:06:54 +00:00
John Baldwin
809f984b21 Add a kern_eaccess() function and use it to implement xenix_eaccess()
rather than kern_access().

Suggested by:	rwatson
2006-02-06 22:00:53 +00:00
John Baldwin
934ba9b2cf - Move the wakeup() for exiting kthreads out of exit1() and into
kthread_exit() as that is cleaner and less obscured.  It also does the
  wakeup sooner.
- Add some comments to kthread_exit().
2006-02-06 21:56:13 +00:00
John Baldwin
2c9d9d392a We don't need the proc lock to check P_KTHREAD on curthread since it is
only set before the kthread starts executing and is never cleared.
2006-02-06 21:54:47 +00:00
Olivier Houchard
2a3b10658d rwlock expects the struct thread to be aligned on 8 bytes, so make sure
thread0 is.
2006-02-06 16:03:10 +00:00
Jeff Roberson
04f6d3effa - Add a ref count to the mount structure. Sleep for up to 3 seconds in
vfs_mount_destroy waiting for this ref to hit 0.  We don't print an
   error if we are rebooting as the root mount always retains some refernces
   by init proc.
 - Acquire a mnt ref for every vnode allocated to a mount point.  Drop this
   ref only once vdestroy() has been called and the mount has been freed.
 - No longer NULL the v_mount pointer in delmntque() so that we may release
   the ref after vgone() has been called.  This allows us to guarantee
   that the mount point structure will be valid until the last vnode has
   lost its last ref.
 - Fix a few places that rely on checking v_mount to detect recycling.

Sponsored by:	Isilon Systems, Inc.
MFC After:	1 week
2006-02-06 10:19:50 +00:00
Jeff Roberson
2f0bca553a - Don't check v_mount for NULL to determine if a vnode has been recycled.
Use the more appropriate VI_DOOMED flag instead.

Sponsored by:	Isilon Systems, Inc.
MFC After:	1 week
2006-02-06 10:15:27 +00:00
Jeff Roberson
36a52c3cae - Add the global 'rebooting' variable that is used to detect when
boot() has been called.

Sponsored by:	Isilon Systems, Inc.
MFC After:	1 week
2006-02-06 10:12:00 +00:00
David Xu
ea8e65b0fa Add members pl_sigmask and pl_siglist into ptrace_lwpinfo to get lwp's
signal mask and pending signals.
2006-02-06 09:41:56 +00:00
Robert Watson
9653775b18 Regenerate. 2006-02-06 02:00:32 +00:00
Robert Watson
c983324ef5 Prefer AUE_FOO audit identifiers to AUE_O_FOO, which are largely left
over from the Darwin implementation.

When we implement a system call as a wrapper to sysctl(), audit it as
AUE_SYSCTL.  This leads to greater compatibility with Solaris audit
trails as sysctl() argument tokens are not the same as the ones for
the originaly system calls (i.e., setdomainname()).

Replace references to AUE_ events that are equivilent to AUE_NULL with
AUE_NULL.  In the case of process signal configuration, this is
because these events do not require auditing.

Move from the Darwin spelling of getsockopt() to the FreeBSD/Solaris
one.

Audit nmount().

Obtained from:	TrustedBSD Project
2006-02-06 02:00:06 +00:00
Robert Watson
89964dd284 When exiting a thread, submit any pending record. Today, we don't
audit thread exit, but should that happen, this will prevent
unhappiness, as the thread exit system call will never return, and
hence not commit the record.

Pointed out by/with:	cognet
Obtained from:		TrustedBSD Project
2006-02-06 01:51:08 +00:00
Wayne Salamon
2f8a46d5ff Audit the arguments (user/group IDs) for the system calls that set these IDs.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-06 00:32:33 +00:00
Wayne Salamon
ad20c8f325 Audit the args to rfork(), and the child PID for all fork system calls.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-06 00:28:50 +00:00
Wayne Salamon
de3007e8f3 Audit the pid being requested in wait4().
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-06 00:19:09 +00:00
Wayne Salamon
a750d0b2a2 Add auditing of arguments to the close() and fstat() system calls. Much more
argument auditing yet to come, for remaining system calls in this file.

Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
2006-02-05 23:57:32 +00:00
Robert Watson
00c28d9678 On process exit, audit the return value of the process, and commit the
record immediately, as this system call never returns.

Obtained from:	TrustedBSD Project
2006-02-05 21:08:25 +00:00
Robert Watson
6e8525ce84 When GC'ing a thread, assert that it has no active audit record.
This should not happen, but with this assert, brueffer and I would
not have spent 45 minutes trying to figure out why he wasn't
seeing audit records with the audit version in CVS.

Obtained from:	TrustedBSD Project
2006-02-05 21:06:09 +00:00
Robert Watson
95fea57c65 Add AUDITVNODE[12] flags to namei(), which cause namei() to audit path
and vnode attribute information for looked up vnodes during the lookup
operation.  This will allow consumers of namei() to specify that this
information be added to the in-process audit record.

Submitted by:	wsalamon
Obtained from:	TrustedBSD Project
2006-02-05 15:42:01 +00:00
David Xu
25c926f1b0 Regenerate. 2006-02-05 02:23:41 +00:00
David Xu
9e7d72246f Implement thr_set_name to set a name for thread.
Reviewed by: julian
2006-02-05 02:18:46 +00:00