Commit Graph

81 Commits

Author SHA1 Message Date
David Xu
0dbb100b9b Move UPCALL related data structure out of kse, introduce a new
data structure called kse_upcall to manage UPCALL. All KSE binding
and loaning code are gone.

A thread owns an upcall can collect all completed syscall contexts in
its ksegrp, turn itself into UPCALL mode, and takes those contexts back
to userland. Any thread without upcall structure has to export their
contexts and exit at user boundary.

Any thread running in user mode owns an upcall structure, when it enters
kernel, if the kse mailbox's current thread pointer is not NULL, then
when the thread is blocked in kernel, a new UPCALL thread is created and
the upcall structure is transfered to the new UPCALL thread. if the kse
mailbox's current thread pointer is NULL, then when a thread is blocked
in kernel, no UPCALL thread will be created.

Each upcall always has an owner thread. Userland can remove an upcall by
calling kse_exit, when all upcalls in ksegrp are removed, the group is
atomatically shutdown. An upcall owner thread also exits when process is
in exiting state. when an owner thread exits, the upcall it owns is also
removed.

KSE is a pure scheduler entity. it represents a virtual cpu. when a thread
is running, it always has a KSE associated with it. scheduler is free to
assign a KSE to thread according thread priority, if thread priority is changed,
KSE can be moved from one thread to another.

When a ksegrp is created, there is always N KSEs created in the group. the
N is the number of physical cpu in the current system. This makes it is
possible that even an userland UTS is single CPU safe, threads in kernel still
can execute on different cpu in parallel. Userland calls kse_create to add more
upcall structures into ksegrp to increase concurrent in userland itself, kernel
is not restricted by number of upcalls userland provides.

The code hasn't been tested under SMP by author due to lack of hardware.

Reviewed by: julian
2003-01-26 11:41:35 +00:00
Matthew Dillon
e3669cee72 Merge all the various copies of vm_fault_quick() into a single
portable copy.
2003-01-16 00:02:21 +00:00
Matthew Dillon
f597900329 Merge all the various copies of vmapbuf() and vunmapbuf() into a single
portable copy.  Note that pmap_extract() must be used instead of
pmap_kextract().

This is precursor work to a reorganization of vmapbuf() to close remaining
user/kernel races (which can lead to a panic).
2003-01-15 23:54:35 +00:00
Peter Grehan
7d13e76928 Add page queues locking to vunmapbuf().
Obtained from: sparc64
Approved by:  benno
2003-01-08 12:29:59 +00:00
Julian Elischer
696058c3c5 Unbreak the KSE code. Keep track of zobie threads using the Per-CPU storage
during the context switch. Rearrange thread cleanups
to avoid problems with Giant. Clean threads when freed or
when recycled.

Approved by:	re (jhb)
2002-12-10 02:33:45 +00:00
Maxime Henrion
b19d9defef Under certain circumstances, we were calling kmem_free() from
i386 cpu_thread_exit().  This resulted in a panic with WITNESS
since we need to hold Giant to call kmem_free(), and we weren't
helding it anymore in cpu_thread_exit().  We now do this from a
new MD function, cpu_thread_dtor(), called by thread_dtor().

Approved by:	re@
Suggested by:	jhb
2002-11-22 23:57:02 +00:00
Peter Grehan
f86c114f3d Add the USER_SR segment register to pcb state. Initialize correctly,
and save/restore during a context switch.

The USER_SR could be overwritten when the current thread was switched
out with a faulting copyin/copyout.

Approved by: Benno
2002-10-21 05:27:41 +00:00
Peter Grehan
4eae112d86 - bring vm_mapbuf/unmapbuf in line with other archs
- update for recent KSE changes

Approved by: benno
2002-09-19 04:39:28 +00:00
Peter Wemm
1a50106e30 Zap the implementations of the i386-aout specific cpu_coredump function.
Most of the non-i386 platforms had rather broken implementations anyway.
2002-09-07 01:26:34 +00:00
Robert Watson
9ca435893b In order to better support flexible and extensible access control,
make a series of modifications to the credential arguments relating
to file read and write operations to cliarfy which credential is
used for what:

- Change fo_read() and fo_write() to accept "active_cred" instead of
  "cred", and change the semantics of consumers of fo_read() and
  fo_write() to pass the active credential of the thread requesting
  an operation rather than the cached file cred.  The cached file
  cred is still available in fo_read() and fo_write() consumers
  via fp->f_cred.  These changes largely in sys_generic.c.

For each implementation of fo_read() and fo_write(), update cred
usage to reflect this change and maintain current semantics:

- badfo_readwrite() unchanged
- kqueue_read/write() unchanged
  pipe_read/write() now authorize MAC using active_cred rather
  than td->td_ucred
- soo_read/write() unchanged
- vn_read/write() now authorize MAC using active_cred but
  VOP_READ/WRITE() with fp->f_cred

Modify vn_rdwr() to accept two credential arguments instead of a
single credential: active_cred and file_cred.  Use active_cred
for MAC authorization, and select a credential for use in
VOP_READ/WRITE() based on whether file_cred is NULL or not.  If
file_cred is provided, authorize the VOP using that cred,
otherwise the active credential, matching current semantics.

Modify current vn_rdwr() consumers to pass a file_cred if used
in the context of a struct file, and to always pass active_cred.
When vn_rdwr() is used without a file_cred, pass NOCRED.

These changes should maintain current semantics for read/write,
but avoid a redundant passing of fp->f_cred, as well as making
it more clear what the origin of each credential is in file
descriptor read/write operations.

Follow-up commits will make similar changes to other file descriptor
operations, and modify the MAC framework to pass both credentials
to MAC policy modules so they can implement either semantic for
revocation.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-08-15 20:55:08 +00:00
Benno Rice
7ade8bb67c Changes for KSE3.
Submitted by:	Peter Grehan <peterg@ptree32.com.au>
2002-07-09 12:57:23 +00:00
Jake Burkholder
8ba3d077ff Add an MD callout like cpu_exit, but which is called after sched_lock is
obtained, when all other scheduling activity is suspended.  This is needed
on sparc64 to deactivate the vmspace of the exiting process on all cpus.
Otherwise if another unrelated process gets the exact same vmspace structure
allocated to it (same address), its address space will not be activated
properly.  This seems to fix some spontaneous signal 11 problems with smp
on sparc64.
2002-06-24 15:48:02 +00:00
Benno Rice
60ead00ef8 - Move macros that represent where syscall args are kept in a trapframe from
trap.c to frame.h
- Use the macros in vm_machdep.c:cpu_fork() to set up the trap frame of the
  new thread.
2002-05-28 12:24:29 +00:00
Alan Cox
6ef4be047a Use the MI vm_map_growstack() instead of the MD grow_stack() in trap(). Remove
the MD grow_stack().
2002-03-30 20:44:31 +00:00
Alfred Perlstein
812344bc0b Remove __P.
Reveiwed by: benno
2002-03-20 23:17:50 +00:00
Benno Rice
3301d20ad9 GC an unused variable in cpu_fork(). 2002-02-28 08:48:58 +00:00
Benno Rice
4eed0cf1be Make fork work, at least for kthreads. Switching still has some issues. 2002-02-28 03:24:07 +00:00
Benno Rice
5244eac968 Complete rework of the PowerPC pmap and a number of other bits in the early
boot sequence.

The new pmap.c is based on NetBSD's newer pmap.c (for the mpc6xx processors)
which is 70% faster than the older code that the original pmap.c was based
on.  It has also been based on the framework established by jake's initial
sparc64 pmap.c.

There is no change to how far the kernel gets (it makes it to the mountroot
prompt in psim) but the new pmap code is a lot cleaner.

Obtained from:	NetBSD (pmap code)
2002-02-14 01:39:11 +00:00
Julian Elischer
079b7badea Pre-KSE/M3 commit.
this is a low-functionality change that changes the kernel to access the main
thread of a process via the linked list of threads rather than
assuming that it is embedded in the process. It IS still embeded there
but remove all teh code that assumes that in preparation for the next commit
which will actually move it out.

Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,
2002-02-07 20:58:47 +00:00
Mark Peek
94e0b85e76 Fix includes based on recent changes to lock.h, mutex.h and ktr.h. 2001-10-19 22:45:46 +00:00
Benno Rice
f3ca814471 Flesh out cpu_fork() and cpu_set_fork_handler(). This is a work in progress. 2001-10-15 12:24:43 +00:00
Matt Jacob
22883e3c68 Fix problem where a user buffer outside of the area being tested
will be corrupted.

PR:		29194
Obtained from:	Tor.Egge@fast.no
MFC after:	2 weeks
2001-10-02 18:34:20 +00:00
Mark Peek
5fd2c51edb Update PowerPC MD code to compile and do initial bootstrap based on
recent changes (KSE and VM requiring physmem to be setup).

Reviewed by:	benno, jhb, julian
2001-09-20 00:47:17 +00:00
Peter Wemm
eb30c1c0b9 Rip some well duplicated code out of cpu_wait() and cpu_exit() and move
it to the MI area.  KSE touched cpu_wait() which had the same change
replicated five ways for each platform.  Now it can just do it once.
The only MD parts seemed to be dealing with fpu state cleanup and things
like vm86 cleanup on x86.  The rest was identical.

XXX: ia64 and powerpc did not have cpu_throw(), so I've put a functional
stub in place.

Reviewed by:	jake, tmm, dillon
2001-09-10 04:28:58 +00:00
Peter Wemm
660c5377fd Missing part of dillon's coredump commit. cpu_coredump() was still
passing IO_NODELOCKED to vn_rdwr(), this would cause operations on the
unlocked core vnode and softupdates nastiness if an a.out binary cored.
2001-09-08 22:18:58 +00:00
Peter Wemm
b53f9c45f9 Nuke #if 0'ed "setredzone()" stub. We never used it, and probably
never will.  I've implemented an optional redzone as part of the KSE
upage breakup.
2001-09-04 08:36:46 +00:00
John Baldwin
8c8a7645c7 Axe stale mp_fixme(). 2001-09-01 00:49:29 +00:00
Matthew Dillon
7197571105 Move vm_page_zero_idle() from machine-dependant sections to a
machine-independant source file, vm/vm_zeroidle.c.  It was exactly the
same for all platforms and updating them all was getting annoying.
2001-07-05 01:32:42 +00:00
Matthew Dillon
6d03d577a5 Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc).
Also removed some spl's and added some VM mutexes, but they are not actually
used yet, so this commit does not really make any operational changes
to the system.

vm_page.c relates to vm_page_t manipulation, including high level deactivation,
activation, etc...  vm_pageq.c relates to finding free pages and aquiring
exclusive access to a page queue (exclusivity part not yet implemented).
And the world still builds... :-)
2001-07-04 23:27:09 +00:00
Matthew Dillon
0cddd8f023 With Alfred's permission, remove vm_mtx in favor of a fine-grained approach
(this commit is just the first stage).  Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
2001-07-04 16:20:28 +00:00
John Baldwin
d2a5bcc3d3 Allow Giant to be recursed when a process terminates. 2001-07-03 05:09:48 +00:00
Benno Rice
d27f1d4c12 This commit (along with one pending in sys/dev/ofw and one in sys/conf) give
us our first minimal glimpse of PowerPC support.

With this code we can get to the "mountroot>" prompt on my Apple iMac.  We
can't get any further due to lack of clock and interrupt handling, among other
things.  This does however mean that pmap and VM are initialising.

We're fairly dependant on OpenFirmware at this point, but I hope to add
support for other classes of firmware at a later stage.

Reviewed by:	obrien, dfr
2001-06-16 07:14:07 +00:00
John Baldwin
d0605c94ed GC #if 0'd calls to releasing and acquiring the old style giant kernel
lock.
2001-05-29 23:35:48 +00:00
Andrew Gallatin
2ee7c91e2e catch these files up to their i386 neighbors to make alpha boot
prior to the vm_mtx
2001-05-21 16:04:24 +00:00
John Baldwin
bf4c03d0e9 Initialize p_md.md_kernnest to 1 for newly fork'd processes since they
start off in the kernel.
2001-04-26 23:52:40 +00:00
Andrew Gallatin
64007795de remove bogus check -- for kernel threads we fork off of proc0, not curproc
This was causing panics when modules which create kthreads were loaded
after boot.

pointed out by: jake, jhb
2001-03-15 02:32:26 +00:00
John Baldwin
ff655691d8 Use the proc lock to protect p_pptr when waking up our parent in cpu_exit()
and remove the mpfixme() message that is now fixed.
2001-03-07 03:20:15 +00:00
John Baldwin
938f15c7c4 Rename switch_trampoline() to fork_trampoline() on the alpha and ia64.
Suggested by:	dfr
2001-02-22 16:56:53 +00:00
John Baldwin
5813dc03bd - Don't call clear_resched() in userret(), instead, clear the resched flag
in mi_switch() just before calling cpu_switch() so that the first switch
  after a resched request will satisfy the request.
- While I'm at it, move a few things into mi_switch() and out of
  cpu_switch(), specifically set the p_oncpu and p_lastcpu members of
  proc in mi_switch(), and handle the sched_lock state change across a
  context switch in mi_switch().
- Since cpu_switch() no longer handles the sched_lock state change, we
  have to setup an initial state for sched_lock in fork_exit() before we
  release it.
2001-02-20 05:26:15 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
John Baldwin
a0346459f1 Update some comments, s0 in the pcb of a child returning from fork1() is
now passed in as a0 to fork_exit() and and s2 is passed in as a1.
2001-01-26 23:32:38 +00:00
John Baldwin
2a36ec35ae - Change fork_exit() to take a pointer to a trapframe as its 3rd argument
instead of a trapframe directly.  (Requested by bde.)
- Convert the alpha switch_trampoline to call fork_exit() and use the MI
  fork_return() instead of child_return().
- Axe child_return().
2001-01-24 21:59:25 +00:00
Jake Burkholder
98f03f9030 Protect proc.p_pptr and proc.p_children/p_sibling with the
proctree_lock.

linprocfs not locked pending response from informal maintainer.

Reviewed by:	jhb, -smp@
2000-12-23 19:43:10 +00:00
Andrew Gallatin
4c0b7a9327 acquire/release Giant in vm_page_zero_idle(), like on i386
Discused with: jhb
2000-12-01 18:55:58 +00:00
Doug Rabson
35cd89e7e5 Convert various calls to splhigh() to disable_intr() since splhigh() is
now a no-op.
2000-11-19 12:28:42 +00:00
John Baldwin
7e4b7c97de Don't perform an mi_switch() when we release Giant during cpu_exit(). We
are about to call cpu_switch() anyways.

Found by:	witness
2000-11-15 19:44:38 +00:00
John Baldwin
8088699f79 - Overhaul the software interrupt code to use interrupt threads for each
type of software interrupt.  Roughly, what used to be a bit in spending
  now maps to a swi thread.  Each thread can have multiple handlers, just
  like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
  software interrupt thread to run, so spending, NSWI, and the shandlers
  array are no longer needed.  We can now have an arbitrary number of
  software interrupt threads.  When you register a software interrupt
  thread via sinthand_add(), you get back a struct intrhand that you pass
  to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
  more intuitive.  Also, prefix all the members of struct intrhand with
  'ih_'.
- Make swi_net() a MI function since there is now no point in it being
  MD.

Submitted by:	cp
2000-10-25 05:19:40 +00:00
John Baldwin
35e0e5b311 Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
2000-10-20 07:58:15 +00:00
Doug Rabson
c703143638 Clear pcb_schednest in cpu_fork() for the child process. This is
is necessary since the child's call stack only includes one recursive
hold of sched_lock.
2000-10-03 08:03:03 +00:00
Jason Evans
0384fff8c5 Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*().  See mutex(9).  (Note: The
  alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
  preempted (i386 only).

Partially contributed by:	BSDi (BSD/OS)
Submissions by (at least):	cp, dfr, dillon, grog, jake, jhb, sheldonh
2000-09-07 01:33:02 +00:00