Commit Graph

235 Commits

Author SHA1 Message Date
Bruce Evans
40a3fa2d59 (1) Removed the bogus condition "p->p_pid != 1" on calling sched_exit()
from exit1().  sched_exit() must be called unconditionally from exit1().
    It was called almost unconditionally because the only exits on system
    shutdown if at all.

(2) Removed the comment that presumed to know what sched_exit() does.
    sched_exit() does different things for the ULE case.  The call became
    essential when it started doing load average stuff, but its caller
    should not know that.

(3) Didn't fix bugs caused by bitrot in the condition.  The condition was
    last correct in rev.1.208 when it was in wait1().  There p was spelled
    curthread->td_proc and was for the waiting parent; now p is for the
    exiting child.  The condition was to avoid lowering init's priority.
    It should be in sched_exit() itself.  Lowering of priorities is broken
    in other ways in at least the 4BSD scheduler, and doing it for init
    causes less noticeable problems than doing it for for shells.

Noticed by:	julian (1)
2004-06-21 14:49:50 +00:00
Bruce Evans
871684b822 Update p_runtime on exit. This fixes calcru() on zombies, and prepares
for not calling calcru() on exit.  calcru() on a zombie can happen if
ttyinfo() (^T) picks one.

PR:		52490
2004-06-21 14:03:38 +00:00
David Xu
b370279ef8 Add comment to reflect that we should retry after thread singling failed. 2004-06-18 11:13:49 +00:00
David Xu
0aabef657e Remove a bogus panic. It is possible more than one threads will
be suspended in thread_suspend_check, after they are resumed, all
threads will call thread_single, but only one can be success,
others should retry and will exit in thread_suspend_check.
2004-06-18 06:21:09 +00:00
Tim J. Robbins
f55530b436 Remove remnants of PGINPROF. 2004-06-08 10:37:30 +00:00
Tim J. Robbins
aa0aa7a113 Move TDF_SA from td_flags to td_pflags (and rename it accordingly)
so that it is no longer necessary to hold sched_lock while
manipulating it.

Reviewed by:	davidxu
2004-06-02 07:52:36 +00:00
Thomas Moestl
65e29c4822 Retire cpu_sched_exit(); it is not used any more. 2004-05-26 12:09:39 +00:00
David Xu
702ac0f112 Clear KSE thread flags after KSE thread mode is ended. The side effect
of not clearing the flags for execv() syscall will result that a new
program runs in KSE thread mode without enabling it.

Submitted by: tjr
Modified by: davidxu
2004-05-21 14:50:23 +00:00
Julian Elischer
b324899838 Remove misplaced duplicate comment and slightly reformat the
version that was in the right place.
2004-05-09 22:29:14 +00:00
Warner Losh
7f8a436ff2 Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core
2004-04-05 21:03:37 +00:00
Brian Feldman
150883179a Add the missing Giant when doing anything with VFS -- in this case,
releasing the ktrace vnode.
2004-03-18 18:15:58 +00:00
John Baldwin
b7e23e826c - Replace wait1() with a kern_wait() function that accepts the pid,
options, status pointer and rusage pointer as arguments.  It is up to
  the caller to copyout the status and rusage to userland if needed.  This
  lets us axe the 'compat' argument and hide all that functionality in
  owait(), by the way.  This also cleans up some locking in kern_wait()
  since it no longer has to drop locks around copyout() since all the
  copyout()'s are deferred.
- Convert owait(), wait4(), and the various ABI compat wait() syscalls to
  use kern_wait() rather than wait1() or wait4().  This removes a bit
  more stackgap usage.

Tested on:	i386
Compiled on:	i386, alpha, amd64
2004-03-17 20:00:00 +00:00
Peter Wemm
a5bdcb2a2f Make the process_exit eventhandler run without Giant. Add Giant hooks
in the two consumers that need it.. processes using AIO and netncp.
Update docs.  Say that process_exec is called with Giant, but not to
depend on it.  All our consumers can handle it without Giant.
2004-03-14 02:06:28 +00:00
Peter Wemm
37814395c1 Push Giant down a little further:
- no longer serialize on Giant for thread_single*() and family in fork,
  exit and exec
- thread_wait() is mpsafe, assert no Giant
- reduce scope of Giant in exit to not cover thread_wait and just do
  vm_waitproc().
- assert that thread_single() family are not called with Giant
- remove the DROP/PICKUP_GIANT macros from thread_single() family
- assert that thread_suspend_check() s not called with Giant
- remove manual drop_giant hack in thread_suspend_check since we know it
  isn't held.
- remove the DROP/PICKUP_GIANT macros from thread_suspend_check() family
- mark kse_create() mpsafe
2004-03-13 22:31:39 +00:00
John Baldwin
4ae89b957c - Push down Giant in exit() and wait().
- Push Giant down a bit in coredump() and call coredump() with the proc
  lock already held rather than unlocking it only to turn around and
  relock it.

Requested by:	peter
2004-03-05 22:39:53 +00:00
John Baldwin
e5bb601d87 Drop sched_lock around the wakeup of the parent process after setting
the process state to zombie when a process exits to avoid a lock order
reversal with the sleepqueue locks.  This appears to be the only place
that we call wakeup() with sched_lock held.
2004-02-27 18:39:09 +00:00
Don Lewis
6567eef757 A Linux thread created using clone() should not send SIGCHLD to its
parent if no signal is specified in the clone() flags argument.

PR:		42457
MFC after:	2 weeks
2004-02-19 06:43:48 +00:00
Don Lewis
55b5f2a202 When reparenting a process to init, make sure that p_sigparent is
set to SIGCHLD.  This avoids the creation of orphaned Linux-threaded
zombies that init is unable to reap.  This can occur when the parent
process sets its SIGCHLD to SIG_IGN.  Fix a similar situation in the
PT_DETACH code.

Tested by:	"Steven Hartland" <killing AT multiplay.co.uk>
2004-02-11 22:06:02 +00:00
John Baldwin
91d5354a2c Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count.  The plimit
  structure is treated similarly to struct ucred in that is is always copy
  on write, so having a reference to a structure is sufficient to read from
  it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading
  limits from a process to keep the limit structure from changing out from
  under you while reading from it.
- Various global limits that are ints are not protected by a lock since
  int writes are atomic on all the archs we support and thus a lock
  wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted
  behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return
  either an rlimit, or the current or max individual limit of the specified
  resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of
  other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit()
  (it didn't used the stackgap when it should have) but uses lim_rlimit()
  and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls,
  but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits.  It
  also no longer uses the stackgap for accessing sysctl's for the
  ibcs2_sysconf() syscall but uses kernel_sysctl() instead.  As a result,
  ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.

Submitted by:	mtm (mostly, I only did a few cleanups and catchups)
Tested on:	i386
Compiled on:	alpha, amd64
2004-02-04 21:52:57 +00:00
Robert Watson
679365e7b9 Reduce gratuitous includes: don't include jail.h if it's not needed.
Presumably, at some point, you had to include jail.h if you included
proc.h, but that is no longer required.

Result of:	self injury involving adding something to struct prison
2004-01-21 17:10:47 +00:00
Olivier Houchard
1a29c80648 Better fix than my previous commit:
in exit1(), make sure the p_klist is empty after sending NOTE_EXIT.
The process won't report fork() or execve() and won't be able to handle
NOTE_SIGNAL knotes anyway.
This fixes some race conditions with do_tdsignal() calling knote() while
the process is exiting.

Reported by:	Stefan Farfeleder <stefan@fafoe.narf.at>
MFC after:	1 week
2003-11-14 18:49:01 +00:00
David Xu
0e2a4d3aeb Rename P_THREADED to P_SA. P_SA means a process is using scheduler
activations.
2003-06-15 00:31:24 +00:00
David E. O'Brien
677b542ea2 Use __FBSDID(). 2003-06-11 00:56:59 +00:00
John Baldwin
5499ea019d Wait for the real interval timer callout handler to finish executing if it
is currently executing when we try to remove it in exit1().  Without this,
it was possible for the callout to bogusly rearm itself and eventually
refire after the process had been free'd resulting in a panic.

PR:		kern/51964
Reported by:	Jilles Tjoelker <jilles@stack.nl>
Reviewed by:	tegge, bde
2003-06-09 21:46:22 +00:00
John Baldwin
90af4afacb - Merge struct procsig with struct sigacts.
- Move struct sigacts out of the u-area and malloc() it using the
  M_SUBPROC malloc bucket.
- Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(),
  sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared().
- Remove the p_sigignore, p_sigacts, and p_sigcatch macros.
- Add a mutex to struct sigacts that protects all the members of the struct.
- Add sigacts locking.
- Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now
  that sigacts is locked.
- Several in-kernel functions such as psignal(), tdsignal(), trapsignal(),
  and thread_stopped() are now MP safe.

Reviewed by:	arch@
Approved by:	re (rwatson)
2003-05-13 20:36:02 +00:00
John Baldwin
7d447c956b Initialize and destroy the struct proc mutex in the proc zone's init and
fini routines instead of in fork() and wait().  This has the nice side
benefit that the proc lock of any process on the allproc list is always
valid and sched_lock doesn't have to be used to test against PRS_NEW
anymore.
2003-05-01 21:16:38 +00:00
John Baldwin
112afcb232 - Protect p_numthreads with the sched_lock.
- Protect p_singlethread with both the sched_lock and the proc lock.
- Protect p_suspcount with the proc lock.
2003-04-23 18:46:51 +00:00
David Xu
11b20c685b Fix lock order reversal problem. 2003-04-21 14:42:04 +00:00
John Baldwin
462f31bff0 Adjust a few comments. 2003-04-17 22:22:47 +00:00
Jeff Roberson
f6f230febe - Adjust sched hooks for fork and exec to take processes as arguments instead
of ksegs since they primarily operation on processes.
 - KSEs take ticks so pass the kse through sched_clock().
 - Add a sched_class() routine that adjusts a ksegrp pri class.
 - Define a sched_fork_{kse,thread,ksegrp} and sched_exit_{kse,thread,ksegrp}
   that will be used to tell the scheduler about new instances of these
   structures within the same process.  These will be used by THR and KSE.
 - Change sched_4bsd to reflect this API update.
2003-04-11 03:39:07 +00:00
Jeff Roberson
2c10d16a4b - Borrow the KSE single threading code for exec and exit. We use the check
if (p->p_numthreads > 1) and not a flag because action is only necessary
   if there are other threads.  The rest of the system has no need to
   identify thr threaded processes.
 - In kern_thread.c use thr_exit1() instead of thread_exit() if P_THREADED
   is not set.
2003-04-01 01:26:20 +00:00
Jeff Roberson
4093529dee - Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with
a follow on commit to kern_sig.c
 - signotify() now operates on a thread since unmasked pending signals are
   stored in the thread.
 - PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.
2003-03-31 22:49:17 +00:00
John Baldwin
75b8b3b25c Replace the at_fork, at_exec, and at_exit functions with the slightly more
flexible process_fork, process_exec, and process_exit eventhandlers.  This
reduces code duplication and also means that I don't have to go duplicate
the eventhandler locking three more times for each of at_fork, at_exec, and
at_exit.

Reviewed by:	phk, jake, almost complete silence on arch@
2003-03-24 21:15:35 +00:00
Dag-Erling Smørgrav
830c3153c6 Unregisterize, ansify. 2003-03-19 00:49:40 +00:00
Dag-Erling Smørgrav
4e8074eba2 Whitespace cleanup. 2003-03-19 00:33:38 +00:00
John Baldwin
a5881ea55a - Cache a reference to the credential of the thread that starts a ktrace in
struct proc as p_tracecred alongside the current cache of the vnode in
  p_tracep.  This credential is then used for all later ktrace operations on
  this file rather than using the credential of the current thread at the
  time of each ktrace event.
- Now that we have multiple ktrace-related items in struct proc that are
  pointers, rename p_tracep to p_tracevp to make it less ambiguous.

Requested by:	rwatson (1)
2003-03-13 18:24:22 +00:00
Tim J. Robbins
6ec62361c8 Tidy up previous change: move comment about obtaining an exclusive
reference where it belongs, and remove a blank line to make it more
obvious what the comment applies to.
2003-03-13 00:57:47 +00:00
Tim J. Robbins
3890793e9c In wait1(), remove the zombie process from zombproc before removing
it from its pgrp to avoid leaving zombies around with p_pgrp == NULL.
This bug was apparent as a NULL-dereference in the pid selection code
in fork1().
2003-03-12 11:10:04 +00:00
David Xu
e574e444e0 Fix threaded process job control bug. SMP tested.
Reviewed by: julian
2003-03-11 00:07:53 +00:00
Julian Elischer
ac2e415327 Change the process flags P_KSES to be P_THREADED.
This is just a cosmetic change but I've been meaning to do it for about a year.
2003-02-27 02:05:19 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Tim J. Robbins
96d7f8ef46 Use the proc lock to protect p_realtimer instead of Giant, and obtain
sched_lock around accesses to p_stats->p_timer[] to avoid a potential
race with hardclock. getitimer(), setitimer() and the realitexpire()
callout are now Giant-free.
2003-02-17 10:03:02 +00:00
Jeff Roberson
5215b1872f - Split the struct kse into struct upcall and struct kse. struct kse will
soon be visible only to schedulers.  This greatly simplifies much the
   KSE code.

Submitted by:	davidxu
2003-02-17 05:14:26 +00:00
Alfred Perlstein
edf6699ae6 Fix LOR with PROC/filedesc. Introduce fdesc_mtx that will be used as a
barrier between free'ing filedesc structures.  Basically if you want to
access another process's filedesc, you want to hold this mutex over the
entire operation.
2003-02-15 05:52:56 +00:00
Julian Elischer
a282253a29 A little infrastructure, preceding some upcoming changes
to the profiling and statistics code.

Submitted by:	DavidXu@
Reviewed by:	peter@
2003-02-08 02:58:16 +00:00
Julian Elischer
6f8132a867 Reversion of commit by Davidxu plus fixes since applied.
I'm not convinced there is anything major wrong with the patch but
them's the rules..

I am using my "David's mentor" hat to revert this as he's
offline for a while.
2003-02-01 12:17:09 +00:00
David Xu
0dbb100b9b Move UPCALL related data structure out of kse, introduce a new
data structure called kse_upcall to manage UPCALL. All KSE binding
and loaning code are gone.

A thread owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and takes those contexts back
to userland. Any thread without upcall structure has to export their
contexts and exit at user boundary.

Any thread running in user mode owns an upcall structure, when it enters
kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread is blocked in kernel, a new UPCALL thread is created and
the upcall structure is transfered to the new UPCALL thread. if the kse
mailbox's current thread pointer is NULL, then when a thread is blocked
in kernel, no UPCALL thread will be created.

Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit, when all upcalls in ksegrp are removed, the group is
atomatically shutdown. An upcall owner thread also exits when process is
in exiting state. when an owner thread exits, the upcall it owns is also
removed.

KSE is a pure scheduler entity. it represents a virtual cpu. when a thread
is running, it always has a KSE associated with it. scheduler is free to
assign a KSE to thread according thread priority, if thread priority is changed,
KSE can be moved from one thread to another.

When a ksegrp is created, there is always N KSEs created in the group. the
N is the number of physical cpu in the current system. This makes it is
possible that even an userland UTS is single CPU safe, threads in kernel still
can execute on different cpu in parallel. Userland calls kse_create to add more
upcall structures into ksegrp to increase concurrent in userland itself, kernel
is not restricted by number of upcalls userland provides.

The code hasn't been tested under SMP by author due to lack of hardware.

Reviewed by: julian
2003-01-26 11:41:35 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Matthew Dillon
3db161e079 It is possible for an active aio to prevent shared memory from being
dereferenced when a process exits due to the vmspace ref-count being
bumped.  Change shmexit() and shmexit_myhook() to take a vmspace instead
of a process and call it in vmspace_dofree().  This way if it is missed
in exit1()'s early-resource-free it will still be caught when the zombie is
reaped.

Also fix a potential race in shmexit_myhook() by NULLing out
vmspace->vm_shm prior to calling shm_delete_mapping() and free().

MFC after:	7 days
2003-01-13 23:04:32 +00:00
Matthew Dillon
389d2b6e21 Fix a refcount race with the vmspace structure. In order to prevent
resource starvation we clean-up as much of the vmspace structure as we
can when the last process using it exits.  The rest of the structure
is cleaned up when it is reaped.  But since exit1() decrements the ref
count it is possible for a double-free to occur if someone else, such as
the process swapout code, references and then dereferences the structure.
Additionally, the final cleanup of the structure should not occur until
the last process referencing it is reaped.

This commit solves the problem by introducing a secondary reference count,
calling 'vm_exitingcnt'.  The normal reference count is decremented on exit
and vm_exitingcnt is incremented.  vm_exitingcnt is decremented when the
process is reaped.  When both vm_exitingcnt and vm_refcnt are 0, the
structure is freed for real.

MFC after:	3 weeks
2002-12-15 18:50:04 +00:00