freebsd-skq/sys/kern/subr_trap.c

193 lines
5.6 KiB
C
Raw Normal View History

1993-06-12 14:58:17 +00:00
/*-
* Copyright (C) 1994, David Greenman
* Copyright (c) 1990, 1993
* The Regents of the University of California. All rights reserved.
1993-06-12 14:58:17 +00:00
*
* This code is derived from software contributed to Berkeley by
* the University of Utah, and William Jolitz.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. All advertising materials mentioning features or use of this software
* must display the following acknowledgement:
* This product includes software developed by the University of
* California, Berkeley and its contributors.
* 4. Neither the name of the University nor the names of its contributors
* may be used to endorse or promote products derived from this software
* without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
genassym.c: Remove NKMEMCLUSTERS, it is no longer define or used. locores.s: Fix comment on PTDpde and APTDpde to be pde instead of pte Add new equation for calculating location of Sysmap Remove Bill's old #ifdef garbage for counting up memory, that stuff will never be made to work and was just cluttering up the file. Add code that places the PTD, page table pages, and kernel stack below the 640k ISA hole if there is room for it, otherwise put this stuff all at 1MB. This fixes the 28K bogusity in the boot blocks, that can now go away! Fix the caclulation of where first is to be dependent on NKPDE so that we can skip over the above mentioned areas. The 28K thing is now 44K in size due to the increase in kernel virtual memory space, but since we no longer have to worry about that this is no big deal. Use if NNPX > 0 instead of ifdef NPX for floating point code. machdep.c Change the calculation of for the buffer cache to be 20% of all memory above 2MB and add back the upper limit of 2/5's of the VM_KMEM_SIZE so that we do not eat ALL of the kernel memory space on large memory machines, note that this will not even come into effect unless you have more than 32MB. The current buffer cache limit is 6.7MB due to this caclulation. It seems that we where erroniously allocating bufpages pages for buffer_map. buffer_map is UNUSED in this implementation of the buffer cache, but since the map is referenced in several if statements a quick fix was to simply allocate 1 vm page (but no real memory) to it. pmap.h Remove rcsid, don't want them in the kernel files! Removed some cruft inside an #ifdef DEBUGx that caused compiler errors if you where compiling this for debug. Use the #defines for PD_SHIFT and PG_SHIFT in place of constants. trap.c: Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code. Remove a now completly invalid check for a maximum virtual address, the virtual address now ends at 0xFFFFFFFF so there is no more MAX!! (Thanks David, I completly missed that one!) vm_machdep.c Remove patch kit header and rcsid, fix $Id$. Now include "npx.h" and use NNPX for controlling the floating point code. Replace several 0xFE00000 constants with KERNBASE
1993-10-15 10:34:29 +00:00
* from: @(#)trap.c 7.4 (Berkeley) 5/13/91
1999-08-28 01:08:13 +00:00
* $FreeBSD$
1993-06-12 14:58:17 +00:00
*/
#ifdef __i386__
#include "opt_npx.h"
#endif
#include <sys/param.h>
#include <sys/bus.h>
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>
#include <sys/resourcevar.h>
#include <sys/signalvar.h>
#include <sys/systm.h>
#include <sys/vmmeter.h>
#include <machine/cpu.h>
#include <machine/pcb.h>
/*
* Define the code needed before returning to user mode, for
* trap and syscall.
*/
void
userret(p, frame, oticks)
struct proc *p;
struct trapframe *frame;
u_int oticks;
{
int sig;
mtx_lock(&Giant);
PROC_LOCK(p);
while ((sig = CURSIG(p)) != 0)
postsig(sig);
mtx_unlock(&Giant);
PROC_UNLOCK(p);
Commit major SMP cleanups and move the BGL (big giant lock) in the syscall path inward. A system call may select whether it needs the MP lock or not (the default being that it does need it). A great deal of conditional SMP code for various deadended experiments has been removed. 'cil' and 'cml' have been removed entirely, and the locking around the cpl has been removed. The conditional separately-locked fast-interrupt code has been removed, meaning that interrupts must hold the CPL now (but they pretty much had to anyway). Another reason for doing this is that the original separate-lock for interrupts just doesn't apply to the interrupt thread mechanism being contemplated. Modifications to the cpl may now ONLY occur while holding the MP lock. For example, if an otherwise MP safe syscall needs to mess with the cpl, it must hold the MP lock for the duration and must (as usual) save/restore the cpl in a nested fashion. This is precursor work for the real meat coming later: avoiding having to hold the MP lock for common syscalls and I/O's and interrupt threads. It is expected that the spl mechanisms and new interrupt threading mechanisms will be able to run in tandem, allowing a slow piecemeal transition to occur. This patch should result in a moderate performance improvement due to the considerable amount of code that has been removed from the critical path, especially the simplification of the spl*() calls. The real performance gains will come later. Approved by: jkh Reviewed by: current, bde (exception.s) Some work taken from: luoqi's patch
2000-03-28 07:16:37 +00:00
Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
mtx_lock_spin(&sched_lock);
Implement a unified run queue and adjust priority levels accordingly. - All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propogated up into interrupt thread range if need be. - I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels. - The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure. - Group the priority level, user priority, native priority (before propogation) and the scheduling class into a struct priority. - Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI). - Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt. - Activate propogate_priority(). It should now have the desired effect without needing to also propogate the scheduling class. - Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propogate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propogate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at apprioriate times by the vm system. - Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.
2001-02-12 00:20:08 +00:00
p->p_pri.pri_level = p->p_pri.pri_user;
if (p->p_sflag & PS_NEEDRESCHED) {
/*
* Since we are curproc, a clock interrupt could
* change our priority without changing run queues
* (the running process is not kept on a run queue).
* If this happened after we setrunqueue ourselves but
* before we switch()'ed, we might not be on the queue
* indicated by our priority.
*/
DROP_GIANT_NOSWITCH();
setrunqueue(p);
p->p_stats->p_ru.ru_nivcsw++;
mi_switch();
Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
mtx_unlock_spin(&sched_lock);
PICKUP_GIANT();
mtx_lock(&Giant);
PROC_LOCK(p);
while ((sig = CURSIG(p)) != 0)
postsig(sig);
mtx_unlock(&Giant);
PROC_UNLOCK(p);
} else
mtx_unlock_spin(&sched_lock);
/*
* Charge system time if profiling.
*/
if (p->p_sflag & PS_PROFIL)
addupc_task(p, TRAPF_PC(frame),
((u_int)p->p_sticks - oticks) * psratio);
}
1993-06-12 14:58:17 +00:00
/*
* Process an asynchronous software trap.
* This is relatively easy.
* This function will return with preemption disabled.
1993-06-12 14:58:17 +00:00
*/
void
ast(framep)
struct trapframe *framep;
{
struct proc *p = CURPROC;
u_int prticks, sticks;
critical_t s;
int sflag;
#if defined(DEV_NPX) && !defined(SMP)
int ucode;
#endif
KASSERT(TRAPF_USERMODE(framep), ("ast in kernel mode"));
#ifdef WITNESS
if (witness_list(p))
panic("Returning to user mode with mutex(s) held");
#endif
mtx_assert(&Giant, MA_NOTOWNED);
s = critical_enter();
while ((p->p_sflag & (PS_ASTPENDING | PS_NEEDRESCHED)) != 0) {
critical_exit(s);
p->p_frame = framep;
/*
* This updates the p_sflag's for the checks below in one
* "atomic" operation with turning off the astpending flag.
* If another AST is triggered while we are handling the
* AST's saved in sflag, the astpending flag will be set and
* we will loop again.
*/
Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
mtx_lock_spin(&sched_lock);
sticks = p->p_sticks;
sflag = p->p_sflag;
p->p_sflag &= ~(PS_OWEUPC | PS_ALRMPEND | PS_PROFPEND |
PS_ASTPENDING);
cnt.v_soft++;
if (sflag & PS_OWEUPC) {
prticks = p->p_stats->p_prof.pr_ticks;
p->p_stats->p_prof.pr_ticks = 0;
mtx_unlock_spin(&sched_lock);
addupc_task(p, p->p_stats->p_prof.pr_addr, prticks);
} else
mtx_unlock_spin(&sched_lock);
if (sflag & PS_ALRMPEND) {
PROC_LOCK(p);
psignal(p, SIGVTALRM);
PROC_UNLOCK(p);
}
#if defined(DEV_NPX) && !defined(SMP)
if (PCPU_GET(curpcb)->pcb_flags & PCB_NPXTRAP) {
atomic_clear_char(&PCPU_GET(curpcb)->pcb_flags,
PCB_NPXTRAP);
ucode = npxtrap();
if (ucode != -1) {
mtx_lock(&Giant);
trapsignal(p, SIGFPE, ucode);
}
}
#endif
if (sflag & PS_PROFPEND) {
PROC_LOCK(p);
psignal(p, SIGPROF);
PROC_UNLOCK(p);
}
The biggie: Get rid of the UPAGES from the top of the per-process address space. (!) Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit. Context switch the tss_esp0 pointer in the common tss. This is a lot simpler since than swithching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset. The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS). Simplity the pmap code to manage process contexts, we no longer have to double map the UPAGES, this simplifies and should measuably speed up fork(). The following parts came from John Dyson: Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out. Move the upages object (upobj) from the vmspace to the proc structure. Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.
1997-04-07 07:16:06 +00:00
userret(p, framep, sticks);
if (mtx_owned(&Giant))
mtx_unlock(&Giant);
s = critical_enter();
}
mtx_assert(&Giant, MA_NOTOWNED);
/*
* We need to keep interrupts disabled so that if any further AST's
* come in, the interrupt they come in on will be delayed until we
* finish returning to userland. We assume that the return to userland
* will perform the equivalent of critical_exit().
*/
The biggie: Get rid of the UPAGES from the top of the per-process address space. (!) Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit. Context switch the tss_esp0 pointer in the common tss. This is a lot simpler since than swithching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset. The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS). Simplity the pmap code to manage process contexts, we no longer have to double map the UPAGES, this simplifies and should measuably speed up fork(). The following parts came from John Dyson: Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out. Move the upages object (upobj) from the vmspace to the proc structure. Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.
1997-04-07 07:16:06 +00:00
}