mutex releases to not require flags for the cases when preemption is
not allowed:
The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent
switching to a higher priority thread on mutex releease and swi schedule,
respectively when that switch is not safe. Now that the critical section
API maintains a per-thread nesting count, the kernel can easily check
whether or not it should switch without relying on flags from the
programmer. This fixes a few bugs in that all current callers of
swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from
fast interrupt handlers and the swi_sched of softclock needed this flag.
Note that to ensure that swi_sched()'s in clock and fast interrupt
handlers do not switch, these handlers have to be explicitly wrapped
in critical_enter/exit pairs. Presently, just wrapping the handlers is
sufficient, but in the future with the fully preemptive kernel, the
interrupt must be EOI'd before critical_exit() is called. (critical_exit()
can switch due to a deferred preemption in a fully preemptive kernel.)
I've tested the changes to the interrupt code on i386 and alpha. I have
not tested ia64, but the interrupt code is almost identical to the alpha
code, so I expect it will work fine. PowerPC and ARM do not yet have
interrupt code in the tree so they shouldn't be broken. Sparc64 is
broken, but that's been ok'd by jake and tmm who will be fixing the
interrupt code for sparc64 shortly.
Reviewed by: peter
Tested on: i386, alpha
and it's associated state variables: icu_lock with the name "icu". This
renames the imen_mtx for x86 SMP, but also uses the lock to protect
access to the 8259 PIC on x86 UP. This also adds an appropriate lock to
the various Alpha chipsets which fixes problems with Alpha SMP machines
dropping interrupts with an SMP kernel.
code in ipl.s and icu_ipl.s that used them was removed when the
interrupt thread system was committed. Debuggers also knew about
Xresume* because these labels hide the real names of the interrupt
handlers (Xintr*), and debuggers need to special-case interrupt
handlers to get the interrupt frame.
Both gdb and ddb will now use the Xintr* and Xfastintr* symbols to
detect interrupt frames. Fast interrupt frames were never identified
correctly before, so this fixes the problem of the running stack
frame getting lost in a ddb or gdb trace generated from a fast
interrupt - e.g. when debugging a simple infinite loop in the kernel
using a serial console, the frame containing the loop would never
appear in a gdb or ddb trace.
Reviewed by: jhb, bde
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
been made machine independent and various other adjustments have been made
to support Alpha SMP.
- It splits the per-process portions of hardclock() and statclock() off
into hardclock_process() and statclock_process() respectively. hardclock()
and statclock() call the *_process() functions for the current process so
that UP systems will run as before. For SMP systems, it is simply necessary
to ensure that all other processors execute the *_process() functions when the
main clock functions are triggered on one CPU by an interrupt. For the alpha
4100, clock interrupts are delievered in a staggered broadcast fashion, so
we simply call hardclock/statclock on the boot CPU and call the *_process()
functions on the secondaries. For x86, we call statclock and hardclock as
usual and then call forward_hardclock/statclock in the MD code to send an IPI
to cause the AP's to execute forwared_hardclock/statclock which then call the
*_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
involve less hackery. Now the cpu doing the forward sets any flags, etc. and
sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically
return so that they can execute ast() and don't bother with setting the
astpending or needresched flags themselves. This also removes the loop in
forward_signal() as sched_lock closes the race condition that the loop worked
around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
a process to act on rather than assuming curproc so that they can be used to
implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
header sys/smp.h declares MI SMP variables and API's. The IPI API's from
machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
SLIST of globaldata structures has become MI and moved into subr_smp.c.
Also, the globaldata list is only available if SMP support is compiled in.
Reviewed by: jake, peter
Looked over by: eivind
and used in C or vice versa. The elf compiler uses the same names
for both. Remove asnames.h with great prejudice; it has served its
purpose.
Note that this does not affect the ability to generate an aout kernel
due to gcc's -mno-underscores option.
moral support from: peter, jhb
attributes. This is needed for AST's to be properly posted in a preemptive
kernel. They are backed by two new flags in p_sflag: PS_ASTPENDING and
PS_NEEDRESCHED. They are still accesssed by their old macros:
aston(), astoff(), etc. For completeness, an astpending() macro has been
added to check for a pending AST, and clear_resched() has been added to
clear need_resched().
- Rename syscall2() on the x86 back to syscall() to be consistent with
other architectures.
interrupt threads to run with it always >= 1, so that malloc can
detect M_WAITOK from "interrupt" context. This is also necessary
in order to context switch from sched_ithd() directly.
Reviewed By: peter
variables from i386 assembly language. The syntax is PCPU(member)
where member is the capitalized name of the per-cpu variable, without
the gd_ prefix. Example: movl %eax,PCPU(CURPROC). The capitalization
is due to using the offsets generated by genassym rather than the symbols
provided by linking with globals.o. asmacros.h is the wrong place for
this but it seemed as good a place as any for now. The old implementation
in asnames.h has not been removed because it is still used to de-mangle
the symbols used by the C variables for the UP case.
Also, while here, run up to 32 interrupt sources on APIC systems.
Normalize INTREN/INTRDIS so they are the same on both UP and SMP systems
rather than sometimes a macro, and sometimes a function.
Reviewed by: jhb, jakeb
return through doreti to handle ast's. This is necessary for the
clock interrupts to work properly.
- Change the clock interrupts on the x86 to be fast instead of threaded.
This is needed because both hardclock() and statclock() need to run in
the context of the current process, not in a separate thread context.
- Kill the prevproc hack as it is no longer needed.
- We really need Giant when we call psignal(), but we don't want to block
during the clock interrupt. Instead, use two p_flag's in the proc struct
to mark the current process as having a pending SIGVTALRM or a SIGPROF
and let them be delivered during ast() when hardclock() has finished
running.
- Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It was
broken on the x86 if it was turned on since cpl is gone. It's only use
was to bogusly run softclock() directly during hardclock() rather than
scheduling an SWI.
- Remove the COM_LOCK simplelock and replace it with a clock_lock spin
mutex. Since the spin mutex already handles disabling/restoring
interrupts appropriately, this also lets us axe all the *_intr() fu.
- Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use
temporary fast interrupts for the APIC trial.
- Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending
signals in hardclock() that are to be delivered in ast().
Submitted by: jakeb (making statclock safe in a fast interrupt)
Submitted by: cp (concept of delaying signals until ast())
- Make softinterrupts (SWI's) almost completely MI, and divorce them
completely from the x86 hardware interrupt code.
- The ihandlers array is now gone. Instead, there is a MI shandlers array
that just contains SWI handlers.
- Most of the former machine/ipl.h files have moved to a new sys/ipl.h.
- Stub out all the spl*() functions on all architectures.
Submitted by: dfr
include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be
preempted (i386 only).
Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
have their own lock and do not need the MP lock. The SMP cleanup was
a little too conservative in MP locking fast interrupts but at least
it's trivial to fix. MFC soon.
Submitted by: bde
operands. `movw' could be used, but instead let the assembler decide
the right instruction to use.
2. AT&T asm syntax requires a leading '*' in front of the operand for
indirect calls and jumps.
removed unncessary MPLOCKED and 'lock' prefixes from the interrupt
nesting level, since (A) the MP lock is held at the time, and (B) since
the neting level is restored prior to return any interrupted code
will see a consistent value.
syscall path inward. A system call may select whether it needs the MP
lock or not (the default being that it does need it).
A great deal of conditional SMP code for various deadended experiments
has been removed. 'cil' and 'cml' have been removed entirely, and the
locking around the cpl has been removed. The conditional
separately-locked fast-interrupt code has been removed, meaning that
interrupts must hold the CPL now (but they pretty much had to anyway).
Another reason for doing this is that the original separate-lock for
interrupts just doesn't apply to the interrupt thread mechanism being
contemplated.
Modifications to the cpl may now ONLY occur while holding the MP
lock. For example, if an otherwise MP safe syscall needs to mess with
the cpl, it must hold the MP lock for the duration and must (as usual)
save/restore the cpl in a nested fashion.
This is precursor work for the real meat coming later: avoiding having
to hold the MP lock for common syscalls and I/O's and interrupt threads.
It is expected that the spl mechanisms and new interrupt threading
mechanisms will be able to run in tandem, allowing a slow piecemeal
transition to occur.
This patch should result in a moderate performance improvement due to
the considerable amount of code that has been removed from the critical
path, especially the simplification of the spl*() calls. The real
performance gains will come later.
Approved by: jkh
Reviewed by: current, bde (exception.s)
Some work taken from: luoqi's patch
to use a locked cmpexg when unlocking a lock that we already hold, since
nobody else can touch the lock while we hold it. Second, it is not
necessary to use a locked cmpexg when locking a lock that we already
hold, for the same reason. These changes will allow MP locks to be used
recursively without impacting performance.
Modify two procedures that are called only by assembly and are already
NOPROF entries to pass a critical argument in %edx instead of on the
stack, removing a significant amount of code from the critical path
as a consequence.
Reviewed by: Alfred Perlstein <bright@wintelcom.net>, Peter Wemm <peter@netplex.com.au>
the caller to specify a function to be guarded between an entry and exit
barrier, as well as pre- and post-barrier functions.
The primary use for this function is synchronised update of per-cpu private
data. The implementation is almost (but not quite) MI; with a better
mechanism for masking per-CPU interrupts it could probably be hoisted.
Reviewed by: peter (partially)
The old version only worked right when the time was read strictly
more often than every 1/HZ seconds, but we only guarantee reading
it every (1/HZ + epsilon) seconds. Part of rev.1.126-1.127 attempted
to fix this but didn't succeed. Detect counter rollover using the
heuristic from the old version of microtime() with additional
complications for supporting calls from fast interrupt handlers.
This works provided i8254 interrupts are not delayed by more than
1/(2*HZ) seconds.
This needs more comments, and cleanups for the SMP case, and more
testing of the SMP case before it is merged into RELENG_3.
Tested by: jhay
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
is added to point to per-cpu pages, per-cpu global variables are now
accessed through this new selector (%fs). The selectors in gdt table are
rearranged for cache line optimization.
- fask_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.
Reviewed by: Alan Cox <alc@cs.rice.edu>
John Dyson <dyson@iquest.net>
Julian Elischer <julian@whistel.com>
Bruce Evans <bde@zeta.org.au>
David Greenman <dg@root.com>
and use this when masking/unmasking interrupts.
Maintain a mapping from (iopaic number, int pin) tuple to irq number,
and use this when configuring devices and programming the ioapics.
Previous code assumed that irq number was equal to int pin number, and
that the ioapic number was 0.
Don't let an AP enter _cpu_switch before all local apics are initialized.
instead of at compile time using ifdefs.
Use _swi_null instead of dummycamisr. CAM and dpt should call
register_swi() instead of hacking on ihandlers[] directly.
update of cpu usage as shown by top when one process is cpu bound
(no system calls) while the system is otherwise idle (except for top).
Don't attempt to switch to the BSP in boot(). If the system was idle when
an interrupt caused a panic, this won't work. Instead, switch to the BSP
in cpu_reset.
Remove some spurious forward_statclock/forward_hardclock warnings.
interrupts are masked, and EOI is sent iff the corresponding ISR bit
is set in the local apic. If the CPU cannot obtain the interrupt
service lock (currently the global kernel lock) the interrupt is
forwarded to the CPU holding that lock.
Clock interrupts now have higher priority than other slow interrupts.
the signal handling latency for cpu-bound processes that performs very
few system calls.
The IPI for forcing an additional software trap is no longer dependent upon
BETTER_CLOCK being defined.