Commit Graph

237 Commits

Author SHA1 Message Date
Julian Elischer
71fad9fdee Completely redo thread states.
Reviewed by:	davidxu@freebsd.org
2002-09-11 08:13:56 +00:00
David Xu
1279572a92 s/SGNL/SIG/
s/SNGL/SINGLE/
s/SNGLE/SINGLE/

Fix abbreviations for the P_STOPPED_* etc. flags; in the original code they were
inconsistent and difficult to distinguish.

Approved by: julian (mentor)
2002-09-05 07:30:18 +00:00
David Xu
35c32a76f9 In the kernel code, we have the tsleep() call with the PCATCH argument.
PCATCH means "if we get a signal, interrupt me!" and tsleep returns
either EINTR or ERESTART depending on the circumstances.  ERESTART is
"special" because it causes the system call to fail, but right as it
returns back to userland it tells the trap handler to move %eip back a
bit so that userland will immediately re-run the syscall.
This is a syscall restart. It only works for things like read() etc where
nothing has changed yet. Note that *userland* is tricked into restarting
the syscall by the kernel. The kernel doesn't actually do the restart. It
is deadly for things like select, poll, nanosleep etc where it might cause
the elapsed time to be reset and start again from scratch.  So those
syscalls do this to prevent userland rerunning the syscall:
  if (error == ERESTART) error = EINTR;

Fake "signals" like SIGTSTP from ^Z etc do not normally invoke userland
signal handlers. But, in -current, the PCATCH *is* being triggered and
tsleep is returning ERESTART, and the syscall is aborted even though no
userland signal handler was run.
That is the fault here.  We're triggering the PCATCH in cases that we
shouldn't.  ie: it is being triggered on *any* signal processing, rather
than the case where the signal is posted to userland.
	--- Peter

The work of psignal() is a patchwork of special cases required by the process
debugging and job-control facilities...
	--- Kirk McKusick
	"The Design and Implementation of the 4.4BSD Operating System"
	Page 105

In the STABLE source, when psignal() posts a STOP signal to a sleeping
process and the process's signal action is SIG_DFL, the system directly
changes the process state from SSLEEP to SSTOP; when SIGCONT is later
posted to the stopped process and the process is found still on the
sleep queue, its state is restored to SSLEEP without waking it up.

This commit mimics that behaviour from the STABLE source tree.

Reviewed by: Jon Mini, Tim Robbins, Peter Wemm
Approved by: julian@freebsd.org (mentor)
2002-09-03 12:56:01 +00:00
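The restart machinery Peter describes above reduces to a small pattern inside any timed-sleep syscall. A minimal, illustrative sketch (invented function name and wait channel, not the actual kern_time.c code):

/*
 * Illustrative sketch only: a timed sleep where a transparent restart
 * would reset the elapsed time, so ERESTART is converted to EINTR
 * before returning to userland.
 */
static int
example_timed_sleep(struct thread *td, int timo)
{
	static int example_chan;	/* dummy wait channel */
	int error;

	/* PCATCH: let a posted signal interrupt the sleep. */
	error = tsleep(&example_chan, PWAIT | PCATCH, "exslp", timo);
	if (error == EWOULDBLOCK)
		return (0);		/* full timeout elapsed */
	if (error == ERESTART)
		error = EINTR;		/* don't let the trap handler re-run us */
	return (error);
}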
Ian Dowse
8f19eb88df Split out a number of mostly VFS and signal related syscalls into
a kernel-internal kern_*() version and a wrapper that is called via
the syscall vector table. For paths and structure pointers, the
internal version either takes a uio_seg parameter or requires the
caller to copyin() the data to kernel memory as appropriate. This
will permit emulation layers to use these syscalls without having
to copy out translated arguments to the stack gap.

Discussed on:		-arch
Review/suggestions:	bde, jhb, peter, marcel
2002-09-01 20:37:28 +00:00
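A sketch of the wrapper/internal split described above, with invented names (the real kern_*() functions live in the respective kern_* and vfs_* files and do considerably more):

/* Internal version: the caller says where the path string lives. */
int
kern_example_unlink(struct thread *td, char *path, enum uio_seg pathseg)
{
	char buf[MAXPATHLEN];	/* real code would malloc(9) this */
	char *kpath = path;
	int error;

	if (pathseg == UIO_USERSPACE) {
		error = copyinstr(path, buf, sizeof(buf), NULL);
		if (error != 0)
			return (error);
		kpath = buf;
	}
	/* ... operate on kpath, now guaranteed to be in kernel memory ... */
	return (0);
}

/* Thin wrapper dispatched via the syscall vector table. */
int
example_unlink(struct thread *td, struct example_unlink_args *uap)
{
	return (kern_example_unlink(td, uap->path, UIO_USERSPACE));
}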
Julian Elischer
b39f32841b move the assert to cover more cases 2002-08-26 05:02:56 +00:00
Julian Elischer
d9d6e34fd0 Don't re-lock the sched lock if we didn't unlock it.
Original error by: David Xu <bsddiy@yahoo.com>
Fix by:	David Xu <bsddiy@yahoo.com>
Completely failed to spot it: Julian Elischer <julian@freebsd.org>
2002-08-23 07:23:44 +00:00
Julian Elischer
721e591067 Revert some suspension/sleep/signal code from KSE-III
We need to rethink a bit of this and it doesn't matter if
we break the KSE test program for now as long
as non-KSE programs act as expected.

Submitted by:	David Xu <bsddiy@yahoo.com>
	(this guy's just asking to get hit with a commit bit..)
2002-08-21 20:03:55 +00:00
Julian Elischer
6933e3c12b Do some work on keeping better track of stopped/continued state.
I'm not sure what happened to the original setting of the P_CONTINUED
flag. It appears to have been lost in the paper shuffling...

Submitted by:	David Xu <bsddiy@yahoo.com>
2002-08-08 06:18:41 +00:00
Bruce Evans
1c530be49c Try harder to "set signal flags proprly [sic] for ast()". See rev.1.154. 2002-08-06 15:22:09 +00:00
Julian Elischer
04774f2357 Slight cleanup of some comments/whitespace.
Make idle process state more consistent.
Add an assert on thread state.
Clean up idleproc/mi_switch() interaction.
Use a local instead of referencing curthread 7 times in a row
(I've been told curthread can be expensive on some architectures)
Remove some commented out code.
Add a little commented out code (completion coming soon)

Reviewed by:	jhb@freebsd.org
2002-08-01 18:45:10 +00:00
Julian Elischer
4d492b4369 Don't need to hold schedlock specifically for stop() as it calls wakeup(),
which locks it anyhow.

Reviewed by: jhb@freebsd.org
2002-07-30 21:13:48 +00:00
Julian Elischer
38038891e9 revert some of the handling of STOP signals in
issignal(). Let thread_suspend_check() actually do the suspension
at the user boundary.

Submitted by:	David Xu <bsddiy@yahoo.com>
2002-07-24 07:23:41 +00:00
Don Lewis
832dafad3d Rearrange the code so that it checks whether the file is something
valid to write a core dump to before doing the preparations to actually
write to the file.

Call VOP_GETATTR() before dropping the initial vnode lock.
2002-07-10 06:31:35 +00:00
Julian Elischer
aa0fa33464 Try to clean up some of the mess that resulted from layers and layers
of p4 merges from -current as things started to diverge.

Corroborated by: Similar patches just mailed by BDE.
2002-07-03 09:15:20 +00:00
Julian Elischer
ee9919b024 Whitespace commit.
I'm working on this file but I wanted to make the whitespace commit
separately.
2002-07-03 06:15:26 +00:00
Andrew Gallatin
0ac3b6364f Hold the sched lock across the call to forward_signal() in tdsignal() to
keep SMP systems from panicking when ^C'ing an app.

suggested by julian
2002-07-03 02:55:48 +00:00
Julian Elischer
e602ba25fd Part 1 of KSE-III
The ability to schedule multiple threads per process
(on one cpu) by making ALL system calls optionally asynchronous.
To come: ia64 and power-pc patches, patches for gdb, test program (in tools)

Reviewed by:	Almost everyone who counts
	(at various times, peter, jhb, matt, alfred, mini, bernd,
	and a cast of thousands)

	NOTE: this is still Beta code, and contains lots of debugging stuff.
	expect slight instability in signals..
2002-06-29 17:26:22 +00:00
Alfred Perlstein
016091145e more caddr_t removal. 2002-06-29 02:00:02 +00:00
John Baldwin
374a15aa55 - trapsignal() no longer needs to acquire Giant for ktrpsig().
- Catch up to new ktrace API.
2002-06-07 05:43:02 +00:00
Chad David
ca18d53eae s/!SIGNOTEMPTY/SIGISEMPTY/
Reviewed by: marcel, jhb, alfred
2002-06-06 19:12:41 +00:00
Mike Barcroft
6ee093fb8f Add POSIX.1-2001 WCONTINUED option for waitpid(2). A proc flag
(P_CONTINUED) is set when a stopped process receives a SIGCONT and
cleared after it has notified a parent process that has requested
notification via waitpid(2) with WCONTINUED specified in its options
operand.  The status value can be checked with the new WIFCONTINUED()
macro.

Reviewed by:	jake
2002-06-01 18:37:46 +00:00
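From the userland side, the new option is used like this (a plain POSIX example, not FreeBSD-internal code):

#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>

void
report_child(pid_t child)
{
	int status;

	if (waitpid(child, &status, WUNTRACED | WCONTINUED) == -1) {
		perror("waitpid");
		return;
	}
	if (WIFSTOPPED(status))
		printf("child stopped by signal %d\n", WSTOPSIG(status));
	else if (WIFCONTINUED(status))
		printf("child continued by SIGCONT\n");
	else if (WIFEXITED(status))
		printf("child exited with status %d\n", WEXITSTATUS(status));
}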
Julian Elischer
628855e758 CURSIG() is not a macro so rename it cursig().
Obtained from:	KSE tree
2002-05-29 23:44:32 +00:00
John Baldwin
f44d9e24fb Change p_can{debug,see,sched,signal}()'s first argument to be a thread
pointer instead of a proc pointer and require the process pointed to
by the second argument to be locked.  We now use the thread ucred reference
for the credential checks in p_can*() as a result.  p_canfoo() should now
no longer need Giant.
2002-05-19 00:14:50 +00:00
Robert Watson
661016419c p_cansignal() returns an errno value; at some point, the check for
inter-process signalling ceased to preserve and return that value,
instead always returning EPERM.  This meant that it was possible
to "probe" the pid space for processes that were not otherwise
visible.  This change reverts that reversion.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-05-14 23:07:15 +00:00
Jonathan Mini
d8f4f6a404 Remove trace_req().
Reviewed by:	alfred, jhb, peter
2002-05-09 04:13:41 +00:00
Alfred Perlstein
8b43b53530 expand_name fixes:
.) don't use MAXPATHLEN + 1, fix logic to compensate.
.) style(9) function parameters.
.) fix line wrapping.
.) remove duplicated error and string handling code.
.) don't NUL terminate already NUL terminated string.
.) all string length variables changed from int to size_t.
.) constify variables.
.) catch when corename would be truncated.
.) cast pid_t and uid_t args for format string.
.) add parens around return arguments.

Help and suggestions from: bde
2002-05-08 09:06:47 +00:00
Alfred Perlstein
b2bc3101a8 M_ZERO the temp buffer in expand_name(); otherwise, if an error occurs
while logging, we may pass a non-NUL-terminated string to log(9) for a
%s format arg.
2002-05-07 23:37:07 +00:00
Bruce Evans
f5216b9a19 Return the correct error code (ENOSYS, not EINVAL) from nosys(). Getting
killed by SIGSYS for unimplemented syscalls is bad enough.

Obtained from:	Lite2 branch

The Lite2 branch has some other interesting unmerged (?) bits in this
file.  They are well hidden among cosmetic regressions.
2002-05-05 04:50:47 +00:00
John Baldwin
9b3b1c5fdf - Reorder execve() so that it performs blocking operations before it
locks the process.
- Defer other blocking operations such as vrele()'s until after we
  release locks.
- execsigs() now requires the proc lock to be held when it is called
  rather than locking the process internally.
2002-05-02 15:00:14 +00:00
Alfred Perlstein
f132072368 Redo the sigio locking.
Turn the sigio sx into a mutex.

Sigio lock is really only needed to protect interrupts from dereferencing
the sigio pointer in an object when the sigio itself is being destroyed.

In order to do this in the most unintrusive manner change pgsigio's
sigio * argument into a **, that way we can lock internally to the
function.
2002-05-01 20:44:46 +00:00
Ian Dowse
ba1551ca81 Avoid the user-visible effect of setting SA_NOCLDWAIT when the
SIGCHLD handler is SIG_IGN. This is a reimplementation of the
problematic revision 1.131 of kern_exit.c. To avoid accessing process
UPAGES, we set a new procsig flag when the SIGCHLD handler is SIG_IGN
and use that instead.
2002-04-27 22:41:41 +00:00
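The user-visible behaviour at stake, seen from userland (a minimal POSIX example, independent of how the kernel tracks it internally): setting the SIGCHLD disposition to SIG_IGN makes children auto-reap without leaving zombies.

#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
	signal(SIGCHLD, SIG_IGN);	/* same visible effect as SA_NOCLDWAIT */
	if (fork() == 0)
		_exit(0);		/* child exits; no zombie should remain */
	sleep(1);
	return (0);
}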
John Baldwin
ba626c1db2 Lock proctree_lock instead of pgrpsess_lock. 2002-04-16 17:11:34 +00:00
John Baldwin
9c1ab3e04a - Change killpg1()'s first argument to be a thread instead of a process so
we can use td_ucred.
- In killpg1(), the proc lock is sufficient to check if p_stat is SZOMB
  or not.  We don't need sched_lock.
- Close some races in psignal().  In psignal() there is a big switch
  statement based on p_stat.  All the different cases are assuming that
  the process (or thread) isn't going to change state out from under it.
  To ensure this is true, just lock sched_lock for the entire switch.  We
  practically held it the entire time already anyways.  This also
  simplifies the locking somewhat and actually results in fewer lock
  operations.
- Allow signotify() to be called with the sched_lock held since psignal()
  now does that.
- Use td_ucred in a couple of places.
2002-04-13 23:33:36 +00:00
Bruce Evans
79065dba2a Moved signal handling and rescheduling from userret() to ast() so that
they aren't in the usual path of execution for syscalls and traps.
The main complication for this is that we have to set flags to control
ast() everywhere that changes the signal mask.

Avoid locking in userret() in most of the remaining cases.

Submitted by:	luoqi (first part only, long ago, reorganized by me)
Reminded by:	dillon
2002-04-04 17:49:48 +00:00
Bruce Evans
179235b38b Optimized the check for unmasked pending signals in CURSIG() using a new
inline function sigsetmasked() and a new macro SIGPENDING().  CURSIG()
will soon be moved out of the normal path of execution for syscalls and
traps.  Then its efficiency will be less important but the new interfaces
will be useful for checking for unmasked pending signals in more places.

Submitted by:		luoqi (long ago, in a slightly different form)

Assert that sched_lock is not held in CURSIG().
2002-04-04 15:19:41 +00:00
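The idea behind the new helpers, sketched with invented names (the real sigsetmasked()/SIGPENDING() live in sys/signalvar.h and may differ in detail): with a 128-bit sigset_t, the "any unmasked pending signal?" test is a short word-by-word loop.

static __inline int
example_sigsetmasked(sigset_t *pending, sigset_t *mask)
{
	int i;

	for (i = 0; i < _SIG_WORDS; i++)
		if (pending->__bits[i] & ~mask->__bits[i])
			return (0);	/* at least one deliverable signal */
	return (1);			/* everything pending is masked */
}

#define	EXAMPLE_SIGPENDING(p) \
	(!example_sigsetmasked(&(p)->p_siglist, &(p)->p_sigmask))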
Bruce Evans
70f52b4845 Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses.  Switch to KNF
formatting and/or rewrap the whole prototype in some cases.
2002-03-24 05:09:11 +00:00
Alfred Perlstein
4d77a549fe Remove __P. 2002-03-19 21:25:46 +00:00
Poul-Henning Kamp
0f5c7c4b1c Fix warning in !SMP case.
Submitted by:	 Maxime Henrion <mux@mu.org>
2002-02-26 09:21:52 +00:00
Seigo Tanimura
f591779bb5 Lock struct pgrp, session and sigio.
New locks are:

- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.

Please refer to sys/proc.h for the coverage of these locks.

Changes on the pgrp/session interface:

- pgfind() needs the pgrpsess_lock held.

- The caller of enterpgrp() is responsible for allocating a new pgrp and
  session.

- Call enterthispgrp() in order to enter an existing pgrp.

- pgsignal() requires a pgrp lock held.

Reviewed by:	jhb, alfred
Tested on:	cvsup.jp.FreeBSD.org
		(which is a quad-CPU machine running -current)
2002-02-23 11:12:57 +00:00
Bruce Evans
8c3d74f4bf Fixed a typo in rev.1.65 that gave a reference to a nonexistent variable.
This was not detected by LINT because LINT is missing COMPAT_SUNOS.
2002-02-15 03:54:01 +00:00
Julian Elischer
2c1007663f In a threaded world, different priorities become properties of
different entities.  Make it so.

Reviewed by:	jhb@freebsd.org (john baldwin)
2002-02-11 20:37:54 +00:00
Robert Watson
5da271f5a6 Add a comment indicating that VOP_GETATTR() is called without appropriate
locking in the core dump code.  This should be fixed.
2002-02-10 21:45:16 +00:00
Julian Elischer
079b7badea Pre-KSE/M3 commit.
This is a low-functionality change that makes the kernel access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embedded there,
but all the code that assumes so is removed, in preparation for the next
commit which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
2002-02-07 20:58:47 +00:00
Robert Watson
2b87b6d4f4 o Revert kern_sig.c#1.143, as cr_cansignal() doesn't currently permit
a number of desirable cases in which SIGIO/SIGURG are delivered.  We'll
  keep tweaking.

Reported by:	Alexander Kabaev <ak03@gte.com>
2002-01-10 01:25:35 +00:00
Robert Watson
f8efde8991 - Teach SIGIO code to use cr_cansignal() instead of a custom CANSIGIO()
macro.  As a result, mandatory signal delivery policies will be
  applied consistently across the kernel.

- Note that this subtly changes the protection semantics, and we should
  watch out for any resulting breakage.  Previously, delivery of SIGIO
  in this circumstance was limited to situations where the subject was
  privileged, or where one of the subject's (ruid, euid) matched one
  of the object's (ruid, euid).  In the new scenario, subject (ruid, euid)
  are matched against the object's (ruid, svuid), and the object uid's
  must be a subset of the subject uid's.  Likewise, jail now affects
  delivery, and special handling for P_SUGID of the object is present.
  This change can always be reversed or tweaked if it proves to disrupt
  application behavior substantially.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-01-06 00:54:46 +00:00
John Baldwin
c86b6ff551 Change the preemption code for software interrupt thread schedules and
mutex releases to not require flags for the cases when preemption is
not allowed:

The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent
switching to a higher priority thread on mutex release and swi schedule,
respectively when that switch is not safe.  Now that the critical section
API maintains a per-thread nesting count, the kernel can easily check
whether or not it should switch without relying on flags from the
programmer.  This fixes a few bugs in that all current callers of
swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from
fast interrupt handlers and the swi_sched of softclock needed this flag.
Note that to ensure that swi_sched()'s in clock and fast interrupt
handlers do not switch, these handlers have to be explicitly wrapped
in critical_enter/exit pairs.  Presently, just wrapping the handlers is
sufficient, but in the future with the fully preemptive kernel, the
interrupt must be EOI'd before critical_exit() is called.  (critical_exit()
can switch due to a deferred preemption in a fully preemptive kernel.)

I've tested the changes to the interrupt code on i386 and alpha.  I have
not tested ia64, but the interrupt code is almost identical to the alpha
code, so I expect it will work fine.  PowerPC and ARM do not yet have
interrupt code in the tree so they shouldn't be broken.  Sparc64 is
broken, but that's been ok'd by jake and tmm who will be fixing the
interrupt code for sparc64 shortly.

Reviewed by:	peter
Tested on:	i386, alpha
2002-01-05 08:47:13 +00:00
Robert Watson
48f1ba5b0d o Wording fix in comment.
Submitted by:	tanimura via p4
2001-12-14 00:38:01 +00:00
Peter Wemm
6c1534a73e _SIG_MAXSIG (128) is the highest legal signal. The arrays are offset
by one - see _SIG_IDX().  Revert part of my mis-correction in kern_sig.c
(but signal 0 still has to be allowed) and fix _SIG_VALID() (it was
rejecting signal 128).
2001-11-03 13:26:15 +00:00
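The indexing convention being fixed, sketched from the description above rather than quoted from sys/signal.h: signal numbers run 1.._SIG_MAXSIG, the arrays are indexed from 0, and signal 0 is accepted by kill() but never stored.

#define	EXAMPLE_SIG_MAXSIG	128
#define	EXAMPLE_SIG_IDX(sig)	((sig) - 1)	/* signal N lives in array slot N-1 */
#define	EXAMPLE_SIG_VALID(sig)	((sig) <= EXAMPLE_SIG_MAXSIG && (sig) > 0)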
Peter Wemm
049954de94 Partial reversion of rev 1.138. kill and killpg allow a signal
argument of 0.  You cannot return EINVAL for signal 0.  This broke
(in 5 minutes of testing) at least ssh-agent and screen.

However, there was a bug in the original code.  Signal 128 is not
valid.

Pointy-hat to: des, jhb
2001-11-03 12:36:16 +00:00
Dag-Erling Smørgrav
2899d60638 We have a _SIG_VALID() macro, so use it instead of duplicating the test all
over the place.  Also replace a printf() + panic() with a KASSERT().

Reviewed by:	jhb
2001-11-02 23:50:00 +00:00
Ian Dowse
80f42b555d Fix a typo in do_sigaction() where sa_sigaction and sa_handler were
confused. Since sa_sigaction and sa_handler alias each other in a
union, the bug was completely harmless. This had been fixed as part
of the SIGCHLD changes in revision 1.125, but it was reverted when
they were backed out in revision 1.126.
2001-10-07 16:11:37 +00:00
Paul Saab
88b1d98f31 Lock the vnode while truncating the corefile. This fixes a panic
with softupdates dangling deps.

Submitted by:	peter
MFC:		ASAP :)
2001-09-26 01:24:07 +00:00
Julian Elischer
fdd4e5c652 Replace a line accidentally deleted during the KSE additions.
Symptom: a stopped program could not be restarted if it was stopped
while already sleeping.
2001-09-17 20:42:25 +00:00
Robert Watson
9844fbc3b5 o Correct authorization check in CANSIGIO(), which suffered from incorrect
transcription during the (pcred,ucred) merge; this was not used for
  the kill() system call, so does not affect direct explicit process
  signalling.

Pointed out by:	fenner
2001-09-15 22:34:46 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to the previous -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Matthew Dillon
06ae1e91c4 This brings in a Yahoo coredump patch from Paul, with additional mods by
me (addition of vn_rdwr_inchunks).  The problem Yahoo is solving is that
if you have large process images core dumping, or you have a large number of
forked processes all core dumping at the same time, the original coredump code
would leave the vnode locked throughout.  This can cause the directory vnode
to get locked up, which can cause the parent directory vnode to get locked
up, and so on all the way to the root node, locking the entire machine up
for extremely long periods of time.

This patch solves the problem in two ways.  First it uses an advisory
non-blocking lock to abort multiple processes trying to core to the same
file.  Second (my contribution) it chunks up the writes and uses bwillwrite()
to avoid holding the vnode locked while blocking in the buffer cache.

Submitted by:	ps
Reviewed by:	dillon
MFC after:	2 weeks
2001-09-08 20:02:33 +00:00
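A rough sketch of the chunked-write idea; the real vn_rdwr_inchunks() lives in vfs_vnops.c and differs in detail. Breaking one huge write into pieces lets bwillwrite() throttle the dumping process between chunks instead of keeping the vnode locked while blocked in the buffer cache.

#define	EXAMPLE_CHUNK	(64 * 1024)

static int
example_write_inchunks(struct vnode *vp, caddr_t base, int len, off_t offset,
    struct ucred *cred, struct proc *p)
{
	int chunk, error = 0;

	while (len > 0) {
		chunk = len > EXAMPLE_CHUNK ? EXAMPLE_CHUNK : len;
		bwillwrite();		/* back off if the buffer cache is flooded */
		error = vn_rdwr(UIO_WRITE, vp, base, chunk, offset,
		    UIO_USERSPACE, IO_UNIT, cred, NULL, p);
		if (error != 0)
			break;
		offset += chunk;
		base += chunk;
		len -= chunk;
	}
	return (error);
}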
John Baldwin
df53e91c18 Call sendsig() with the proc lock held and return with it held. 2001-09-06 22:20:41 +00:00
Matthew Dillon
fb99ab8811 Giant Pushdown
clock_gettime() clock_settime() nanosleep() settimeofday()
adjtime() getitimer() setitimer() __sysctl() ogetkerninfo()
sigaction() osigaction() sigpending() osigpending() osigvec()
osigblock() osigsetmask() sigsuspend() osigsuspend() osigstack()
sigaltstack() kill() okillpg() trapsignal() nosys()
2001-09-01 18:19:21 +00:00
Matthew Dillon
356861db03 Remove the MPSAFE keyword from the parser for syscalls.master.
Instead introduce the [M] prefix to existing keywords.  e.g.
MSTD is the MP SAFE version of STD.  This is preparatory for a
massive Giant lock pushdown.  The old MPSAFE keyword made
syscalls.master too messy.

Begin comments MP-Safe procedures with the comment:
/*
 * MPSAFE
 */
This comment means that the procedure may be called without
Giant held (The procedure itself may still need to obtain
Giant temporarily to do its thing).

sv_prepsyscall() is now MP SAFE and assumed to be MP SAFE
sv_transtrap() is now MP SAFE and assumed to be MP SAFE

ktrsyscall() and ktrsysret() are now MP SAFE (Giant Pushdown)
trapsignal() is now MP SAFE (Giant Pushdown)

Places which used to do the if (mtx_owned(&Giant)) mtx_unlock(&Giant)
test in syscall[2]() in */*/trap.c now do not.  Instead they
explicitly unlock Giant if they previously obtained it, and then
assert that it is no longer held to catch broken system calls.

Rebuild syscall tables.
2001-08-30 18:50:57 +00:00
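The calling convention described above, in sketch form (not the literal trap.c or syscall code): an MPSAFE-marked syscall takes Giant only where it needs it, drops exactly what it took, and the invariant can be asserted afterwards.

/*
 * MPSAFE
 */
int
example_mpsafe_syscall(struct proc *p, void *uap)
{
	int error = 0;

	mtx_lock(&Giant);		/* only around the non-MPSAFE parts */
	/* ... Giant-protected work ... */
	mtx_unlock(&Giant);

	mtx_assert(&Giant, MA_NOTOWNED);	/* catch broken system calls */
	return (error);
}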
Peter Pentchev
ccdbd10cb7 Prevent passing a null pointer as a filename to vn_open(),
if for some reason expand_name() failed to build a core file name.

PR:		29931
Submitted by:	Foldi Tamas <crow@kapu.hu>
Reviewed by:	dd, -arch
MFC after:	1 month
2001-08-24 15:49:30 +00:00
Peter Wemm
e8ebc08f80 Make COMPAT_43 optional again. XXX we need COMPAT_FBSD3 etc for this
stuff.
2001-08-21 02:32:59 +00:00
Peter Wemm
aa7a4dae6d Temporarily back out kern_sig.c rev 1.125 and kern_exit.c rev 1.131.
This panicked one of my machines one time too many :-( and there is
no sign of a solution in the pipeline.  The deltas are still easily
available in cvs.  The problem is that if the parent has been swapped
out, the child process cannot grope around in the parent's UPAGES to
see the sigact[] array or it will fault.  This probably is a showstopper
for this implementation anyway.
2001-08-01 20:35:24 +00:00
Matthew Dillon
4fec48c6fe As per further discussions on hackers redo the SIGCHLD patch to not generate
an unexpected user-visible side effect with the sigaction flags.  Also cleanup
a minor union issue.

Submitted by: Rudolf Cejka <cejkar@dcse.fee.vutbr.cz>
MFC addendum: MFC will be combined w/ original commit
MFC after: 3 days
2001-07-22 18:47:31 +00:00
John Baldwin
64acb05b1c Grab Giant around postsig() since sendsig() can call into the vm to
grow the stack and we already needed Giant for KTRACE.
2001-07-03 05:27:53 +00:00
John Baldwin
2ad7d3049a - Change CURSIG() and postsig() to require that the proc lock is held
rather than grabbing it and releasing it themselves.  This allows callers
  of these functions to get the lock to close race conditions.
- Grab Giant around ktrace in postsig.
- Count the switches performed on SIGSTOP's as involuntary context switches
  in the resource usage stats.

Reported by:	tegge (signal race), bde (missing csw stats)
2001-06-22 23:02:37 +00:00
John Baldwin
6fad32afc9 Lock Giant in postsig() for the KTRACE case as ktrpsig() needs Giant when
it writes out to the trace file.

Reported by:	peter, gallatin, and others
2001-06-18 19:23:43 +00:00
David Malone
c7fd62da6c Try to make the setting of the SIGCHLD handler the same as the setting of
the NOCLDWAIT flag.  SUSv2 seems to require this.

Submitted by:	Cejka Rudolf <cejkar@dcse.fee.vutbr.cz>
Reviewed by:	dillon
2001-06-11 09:15:41 +00:00
Robert Watson
b1fc0ec1a7 o Merge contents of struct pcred into struct ucred. Specifically, add the
real uid, saved uid, real gid, and saved gid to ucred, as well as the
  pcred->pc_uidinfo, which was associated with the real uid, only rename
  it to cr_ruidinfo so as not to conflict with cr_uidinfo, which
  corresponds to the effective uid.
o Remove p_cred from struct proc; add p_ucred to struct proc, replacing
  the original macro that pointed p->p_ucred to p->p_cred->pc_ucred.
o Universally update code so that it makes use of ucred instead of pcred,
  p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo,
  cr_{r,sv}{u,g}id instead of p_*, etc.
o Remove pcred0 and its initialization from init_main.c; initialize
  cr_ruidinfo there.
o Restructure many credential modification chunks to always crdup while
  we figure out locking and optimizations; generally speaking, this
  means moving to a structure like this:
        newcred = crdup(oldcred);
        ...
        p->p_ucred = newcred;
        crfree(oldcred);
  It's not race-free, but better than nothing.  There are also races
  in sys_process.c, all inter-process authorization, fork, exec, and
  exit.
o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid;
  remove comments indicating that the old arrangement was a problem.
o Restructure exec1() a little to use newcred/oldcred arrangement, and
  use improved uid management primitives.
o Clean up exit1() so as to do less work in credential cleanup due to
  pcred removal.
o Clean up fork1() so as to do less work in credential cleanup and
  allocation.
o Clean up ktrcanset() to take into account changes, and move to using
  suser_xxx() instead of performing a direct uid==0 comparison.
o Improve commenting in various kern_prot.c credential modification
  calls to better document current behavior.  In a couple of places,
  current behavior is a little questionable and we need to check
  POSIX.1 to make sure it's "right".  More commenting work still
  remains to be done.
o Update credential management calls, such as crfree(), to take into
  account new ruidinfo reference.
o Modify or add the following uid and gid helper routines:
      change_euid()
      change_egid()
      change_ruid()
      change_rgid()
      change_svuid()
      change_svgid()
  In each case, the call now acts on a credential not a process, and as
  such no longer requires more complicated process locking/etc.  They
  now assume the caller will do any necessary allocation of an
  exclusive credential reference.  Each is commented to document its
  reference requirements.
o CANSIGIO() is simplified to require only credentials, not processes
  and pcreds.
o Remove lots of (p_pcred==NULL) checks.
o Add an XXX to authorization code in nfs_lock.c, since it's
  questionable, and needs to be considered carefully.
o Simplify posix4 authorization code to require only credentials, not
  processes and pcreds.  Note that this authorization, as well as
  CANSIGIO(), needs to be updated to use the p_cansignal() and
  p_cansched() centralized authorization routines, as they currently
  do not take into account some desirable restrictions that are handled
  by the centralized routines, as well as being inconsistent with other
  similar authorization instances.
o Update libkvm to take these changes into account.

Obtained from:	TrustedBSD Project
Reviewed by:	green, bde, jhb, freebsd-arch, freebsd-audit
2001-05-25 16:59:11 +00:00
John Baldwin
9081e5e826 - Remove unneeded include of sys/ipl.h.
- Require the proc lock be held for killproc() to allow for the vmdaemon to
  kill a process when memory is exhausted while holding the lock of the
  process to kill.
2001-05-15 23:13:58 +00:00
Akinori MUSHA
3b26be6ae1 Properly copy the P_ALTSTACK flag in struct proc::p_flag to the child
process on fork(2).

It is the supposed behavior stated in the manpage of sigaction(2), and
Solaris, NetBSD and FreeBSD 3-STABLE correctly do so.

The previous fix against libc_r/uthread/uthread_fork.c fixed the
problem only for the programs linked with libc_r, so back it out and
fix fork(2) itself to help those not linked with libc_r as well.

PR:		kern/26705
Submitted by:	KUROSAWA Takahiro <fwkg7679@mb.infoweb.ne.jp>
Tested by:	knu, GOTOU Yuuzou <gotoyuzo@notwork.org>,
		and some other people
Not objected by:	hackers
MFC in:		3 days
2001-05-07 18:07:29 +00:00
John Baldwin
6caa8a1501 Overhaul of the SMP code. Several portions of the SMP kernel support have
been made machine independent and various other adjustments have been made
to support Alpha SMP.

- It splits the per-process portions of hardclock() and statclock() off
  into hardclock_process() and statclock_process() respectively.  hardclock()
  and statclock() call the *_process() functions for the current process so
  that UP systems will run as before.  For SMP systems, it is simply necessary
  to ensure that all other processors execute the *_process() functions when the
  main clock functions are triggered on one CPU by an interrupt.  For the alpha
  4100, clock interrupts are delivered in a staggered broadcast fashion, so
  we simply call hardclock/statclock on the boot CPU and call the *_process()
  functions on the secondaries.  For x86, we call statclock and hardclock as
  usual and then call forward_hardclock/statclock in the MD code to send an IPI
  to cause the AP's to execute forward_hardclock/statclock which then call the
  *_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
  involve less hackery.  Now the cpu doing the forward sets any flags, etc. and
  sends a very simple IPI_AST to the other cpu(s).  AST IPIs now just basically
  return so that they can execute ast() and don't bother with setting the
  astpending or needresched flags themselves.  This also removes the loop in
  forward_signal() as sched_lock closes the race condition that the loop worked
  around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
  a process to act on rather than assuming curproc so that they can be used to
  implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
  header sys/smp.h declares MI SMP variables and API's.   The IPI API's from
  machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
  SLIST of globaldata structures has become MI and moved into subr_smp.c.
  Also, the globaldata list is only available if SMP support is compiled in.

Reviewed by:	jake, peter
Looked over by:	eivind
2001-04-27 19:28:25 +00:00
John Baldwin
33a9ed9d0e Change the pfind() and zpfind() functions to lock the process that they
find before releasing the allproc lock and returning.

Reviewed by:	-smp, dfr, jake
2001-04-24 00:51:53 +00:00
Robert Watson
4c5eb9c397 o Replace p_cankill() with p_cansignal(), remove wrappage of p_can()
from signal authorization checking.
o p_cansignal() takes three arguments: subject process, object process,
  and signal number, unlike p_cankill(), which only took into account
  the processes and not the signal number, improving the abstraction
  such that CANSIGNAL() from kern_sig.c can now also be eliminated;
  previously CANSIGNAL() special-cased the handling of SIGCONT based
  on process session.  privused is now deprecated.
o The new p_cansignal() further limits the set of signals that may
  be delivered to processes with P_SUGID set, and restructures the
  access control check to allow it to be extended more easily.
o These changes take into account work done by the OpenBSD Project,
  as well as by Robert Watson and Thomas Moestl on the TrustedBSD
  Project.

Obtained from:  TrustedBSD Project
2001-04-12 02:38:08 +00:00
John Baldwin
5b3047d59f Change stop() to require the sched_lock as well as p's process lock to
avoid silly lock contention on sched_lock since in 2 out of the 3 places
that we call stop(), we get sched_lock right after calling it and we were
locking sched_lock inside of stop() anyways.
2001-04-03 01:39:23 +00:00
John Baldwin
1333047621 - Move the second stop() of process 'p' in issignal() to be after we send
SIGCHLD to our parent process.  Otherwise, we could block while obtaining
  the process lock for our parent process and switch out while we were
  in SSTOP.  Even worse, when we try to resume from the mutex being blocked
  on, our p_stat will be SRUN, not SSTOP.
- Fix a comment above stop() to indicate that it requires that the proc lock
  be held, not a proctree lock.

Reported by:	markm
Sleuthing by:	jake
2001-04-02 17:26:51 +00:00
John Baldwin
1005a129e5 Convert the allproc and proctree locks from lockmgr locks to sx locks. 2001-03-28 11:52:56 +00:00
John Baldwin
c31146a14e - Resort some includes to deal with the new witness code coming in shortly.
- Make sure we have Giant locked before calling coredump() in sigexit().

Spotted by:	peter (2)
2001-03-28 08:41:04 +00:00
John Baldwin
628d2653d6 - Proc locking. Most of signal handling is now MP safe and doesn't require
Giant.  The only exception is the CANSIGNAL() macro.  Unlocking the proc
  lock around sendsig() in trapsignal() is also questionable.  Note that
  the functions sigexit(), psignal(), and issignal() must be called with
  the proc lock of the process in question held.  postsig() and
  trapsignal() should not be called with the proc lock held, but they
  also do not require Giant anymore either.
- Remove spl's that are now no longer needed as they are fully replaced.
2001-03-07 02:59:54 +00:00
Bruce Evans
d2ef4060d7 Fixed a longstanding latency bug in signal delivery. When a signal
is sent to a process, psignal() needs to schedule an AST for the
process if the process is runnable, not just if it is current, so that
pending signals get checked for on the next return of the process to
user mode.  This wasn't practical until recently because the AST flag
was per-cpu so setting it for a non-current process would usually just
cause a bogus AST for the current process.

For non-current processes looping in user mode, it took accidental
(?) magic to deliver signals at all.  Signals were usually delivered
late as a side effect of rescheduling (need_resched() sets astpending,
etc.).  In pre-SMPng, delivery was delayed by at most 1 quantum (the
need_resched() call in roundrobin() is certain to occur within 1
quantum for looping processes).  In -current, things are complicated
by normal interrupt handlers being threads.  Missing handling of the
complications makes roundrobin() a bogus no-op, but preemptive
scheduling sort of works anyway due to even larger bogons elsewhere.
2001-02-19 09:40:58 +00:00
Jake Burkholder
d5a08a6065 Implement a unified run queue and adjust priority levels accordingly.
- All processes go into the same array of queues, with different
  scheduling classes using different portions of the array.  This
  allows user processes to have their priorities propagated up into
  interrupt thread range if need be.
- I chose 64 run queues as an arbitrary number that is greater than
  32.  We used to have 4 separate arrays of 32 queues each, so this
  may not be optimal.  The new run queue code was written with this
  in mind; changing the number of run queues only requires changing
  constants in runq.h and adjusting the priority levels.
- The new run queue code takes the run queue as a parameter.  This
  is intended to be used to create per-cpu run queues.  Implement
  wrappers for compatibility with the old interface which pass in
  the global run queue structure.
- Group the priority level, user priority, native priority (before
  propagation) and the scheduling class into a struct priority.
- Change any hard coded priority levels that I found to use
  symbolic constants (TTIPRI and TTOPRI).
- Remove the curpriority global variable and use that of curproc.
  This was used to detect when a process' priority had lowered and
  it should yield.  We now effectively yield on every interrupt.
- Activate propagate_priority().  It should now have the desired
  effect without needing to also propagate the scheduling class.
- Temporarily comment out the call to vm_page_zero_idle() in the
  idle loop.  It interfered with propagate_priority() because
  the idle process needed to do a non-blocking acquire of Giant
  and then other processes would try to propagate their priority
  onto it.  The idle process should not do anything except idle.
  vm_page_zero_idle() will return in the form of an idle priority
  kernel thread which is woken up at appropriate times by the vm
  system.
- Update struct kinfo_proc to the new priority interface.  Deliberately
  change its size by adjusting the spare fields.  It remained the same
  size, but the layout has changed, so userland processes that use it
  would parse the data incorrectly.  The size constraint should really
  be changed to an arbitrary version number.  Also add a debug.sizeof
  sysctl node for struct kinfo_proc.
2001-02-12 00:20:08 +00:00
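A toy illustration of the unified layout (constants and names invented; the real definitions are in runq.h): one array of queues shared by all scheduling classes, with each class occupying a band of priority levels, so a user process can be boosted into the interrupt band by priority propagation.

#include <sys/queue.h>

#define	EXAMPLE_NQS	64		/* one unified array of run queues */

struct example_runq {
	TAILQ_HEAD(, proc) rq_queues[EXAMPLE_NQS];
};

/* Map a 0..255 priority level onto one of the 64 queues. */
static __inline int
example_runq_index(int pri)
{
	return (pri / (256 / EXAMPLE_NQS));
}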
John Baldwin
142ba5f3d7 - Make astpending and need_resched process attributes rather than CPU
attributes.  This is needed for AST's to be properly posted in a preemptive
  kernel.  They are backed by two new flags in p_sflag: PS_ASTPENDING and
  PS_NEEDRESCHED.  They are still accessed by their old macros:
  aston(), astoff(), etc.  For completeness, an astpending() macro has been
  added to check for a pending AST, and clear_resched() has been added to
  clear need_resched().
- Rename syscall2() on the x86 back to syscall() to be consistent with
  other architectures.
2001-02-10 02:20:34 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

Similarly, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
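A before/after sketch of what the change means at a call site (the proc lock and sched_lock are used only as familiar examples):

static void
example_lock_usage(struct proc *p)
{
	/* Old interface: mtx_enter(&sched_lock, MTX_SPIN); mtx_exit(...). */

	mtx_lock(&p->p_mtx);			/* MTX_DEF (sleep) mutex */
	mtx_unlock(&p->p_mtx);

	mtx_lock_spin(&sched_lock);		/* MTX_SPIN mutex */
	mtx_unlock_spin(&sched_lock);

	/* The surviving flags go through the _flags wrappers. */
	mtx_lock_flags(&p->p_mtx, MTX_QUIET);
	mtx_unlock_flags(&p->p_mtx, MTX_QUIET);
}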
John Baldwin
40447cd4aa - Proc locking.
- Catch up to proc flag changes.
2001-01-24 11:08:02 +00:00
John Baldwin
568ae39fd5 Revert revision 1.102. I don't think p_nice needs to be protected with
sched_lock, and I'm fairly certain P_TRACED will be protected with the
proc lock instead.

Pointed out indirectly by:	bde
2001-01-19 08:23:22 +00:00
Jason Evans
238510fc46 Implement condition variables. 2001-01-16 01:00:43 +00:00
John Baldwin
5192404af2 Protect p_nice and P_TRACED in psignal() above the switch statement with
sched_lock.
2001-01-06 00:08:39 +00:00
John Baldwin
3e6831f510 The previous commit wasn't entirely correct. At least one goto to the
out: label in psignal() did not grab sched_lock before trying to release
it.  Also, the previous version had several cases where it grabbed
sched_lock before jumping to out: unnecessarily, so rework this a bit.
The runfast: and out: labels must be called with sched_lock released, and
the run: label must be called with it held.  Appropriate mtx_assert()'s
have been added that should catch any bugs that may still be in this
code.

Noticed by:	bde
2001-01-02 18:54:09 +00:00
John Baldwin
4bfba0cf19 Push down sched_lock in psignal(). sched_lock was being held across
recursive calls into psignal() as well as calls to signotify(),
forward_signal(), etc.
2001-01-01 02:31:08 +00:00
John Baldwin
ef8294075b Add in a missing release of the proctree lock.
Submitted by:	Sja <sakari.jalovaara@eqonline.fi>
2001-01-01 02:19:51 +00:00
Jake Burkholder
98f03f9030 Protect proc.p_pptr and proc.p_children/p_sibling with the
proctree_lock.

linprocfs not locked pending response from informal maintainer.

Reviewed by:	jhb, -smp@
2000-12-23 19:43:10 +00:00
Marcel Moolenaar
d96cfeae0c Fix a typo that allowed signals caused by traps to be delivered
to the process when said signal is masked.

PR: 23457
Submitted by: Yasuhiko Watanabe <yasu@mrit.mei.co.jp>
2000-12-16 21:03:48 +00:00
Jake Burkholder
c0c2557090 - Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead
of explicit calls to lockmgr.  Also provides macros for the flags
  passed to specify shared, exclusive or release which map to the
  lockmgr flags.  This is so that the use of lockmgr can be easily
  replaced with optimized reader-writer locks.
- Add some locking that I missed the first time.
2000-12-13 00:17:05 +00:00
John Baldwin
1c32c37c06 Protect p_stat with sched_lock. 2000-12-01 23:43:15 +00:00
Marcel Moolenaar
d034d459da Don't use p->p_sigstk.ss_flags to keep state of whether the
process is on the alternate stack or not. For compatibility
with sigstack(2), the state is still updated when needed.

We now determine whether the process is on the alternate
stack by looking at its stack pointer. This allows a process
to siglongjmp from a signal handler on the alternate stack
to the place of the sigsetjmp on the normal stack. When
maintaining state, this would have invalidated the state
information and caused a subsequent signal to be delivered
on the normal stack instead of the alternate stack.

PR: 22286
2000-11-30 05:23:49 +00:00
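The new test, sketched in MI form (the real check is per-arch and the field names may differ): instead of trusting a stored SS_ONSTACK flag, compare the interrupted stack pointer against the registered alternate-stack bounds.

static __inline int
example_onsigstack(struct proc *p, unsigned long sp)
{
	stack_t *ss = &p->p_sigstk;

	return (ss->ss_size != 0 &&
	    sp >= (unsigned long)ss->ss_sp &&
	    sp < (unsigned long)ss->ss_sp + ss->ss_size);
}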
Jake Burkholder
553629ebc9 Protect the following with a lockmgr lock:
allproc
	zombproc
	pidhashtbl
	proc.p_list
	proc.p_hash
	nextpid

Reviewed by:	jhb
Obtained from:	BSD/OS and netbsd
2000-11-22 07:42:04 +00:00
Jake Burkholder
7da6f97772 - Split the run queue and sleep queue linkage, so that a process
may block on a mutex while on the sleep queue without corrupting
it.
- Move dropping of Giant to after the acquire of sched_lock.

Tested by:	John Hay <jhay@icomtek.csir.co.za>
		jhb
2000-11-17 18:09:18 +00:00
John Baldwin
20cdcc5b73 Don't release and acquire Giant in mi_switch(). Instead, release and
acquire Giant as needed in functions that call mi_switch().  The releases
need to be done outside of the sched_lock to avoid potential deadlocks
from trying to acquire Giant while interrupts are disabled.

Submitted by:	witness
2000-11-16 02:16:44 +00:00
Marcel Moolenaar
806d7daafe Make MINSIGSTKSZ machine dependent, and have the sigaltstack
syscall compare against a variable sv_minsigstksz in struct
sysentvec as to properly take the size of the machine- and
ABI dependent struct sigframe into account.

The SVR4 and iBCS2 modules continue to have a minsigstksz of
8192 to preserve behavior. The real values (if different) are
not known at this time. Other ABI modules use the real
values.

The native MINSIGSTKSZ is now defined as follows:

Arch		MINSIGSTKSZ
----		-----------
alpha		    4096
i386		    2048
ia64		   12288

Reviewed by: mjacob
Suggested by: bde
2000-11-09 08:25:48 +00:00
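From userland this stays invisible as long as the alternate stack is sized with the (now machine-dependent) constants; a minimal POSIX example:

#include <signal.h>
#include <stdlib.h>

static void
install_altstack(void)
{
	stack_t ss;

	ss.ss_sp = malloc(SIGSTKSZ);	/* SIGSTKSZ >= MINSIGSTKSZ on every arch */
	ss.ss_size = SIGSTKSZ;
	ss.ss_flags = 0;
	if (ss.ss_sp == NULL || sigaltstack(&ss, NULL) == -1)
		abort();
}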
John Baldwin
35e0e5b311 Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
2000-10-20 07:58:15 +00:00
Bruce Evans
33510ef17a Unpessimized CURSIG(). The fast path through CURSIG() was broken in
the 128-bit sigset_t changes by moving conditionally (rarely) executed
code to the beginning where it is always executed, and since this code
now involves 3 128-bit operations, the pessimization was relatively
large.  This change speeds up lmbench's pipe latency benchmark by
3.5%.

Fixed style bugs in CURSIG().
2000-09-17 15:12:04 +00:00