enabled in critical sections and streamline critical_enter() and
critical_exit().
This commit allows an architecture to leave interrupts enabled inside
critical sections if it so wishes. Architectures that do not wish to do
this are not effected by this change.
This commit implements the feature for the I386 architecture and provides
a sysctl, debug.critical_mode, which defaults to 1 (use the feature). For
now you can turn the sysctl on and off at any time in order to test the
architectural changes or track down bugs.
This commit is just the first stage. Some areas of the code, specifically
the MACHINE_CRITICAL_ENTER #ifdef'd code, is strictly temporary and will
be cleaned up in the STAGE-2 commit when the critical_*() functions are
moved entirely into MD files.
The following changes have been made:
* critical_enter() and critical_exit() for I386 now simply increment
and decrement curthread->td_critnest. They no longer disable
hard interrupts. When critical_exit() decrements the counter to
0 it effectively calls a routine to deal with whatever interrupts
were deferred during the time the code was operating in a critical
section.
Other architectures are unaffected.
* fork_exit() has been conditionalized to remove MD assumptions for
the new code. Old code will still use the old MD assumptions
in regards to hard interrupt disablement. In STAGE-2 this will
be turned into a subroutine call into MD code rather then hardcoded
in MI code.
The new code places the burden of entering the critical section
in the trampoline code where it belongs.
* I386: interrupts are now enabled while we are in a critical section.
The interrupt vector code has been adjusted to deal with the fact.
If it detects that we are in a critical section it currently defers
the interrupt by adding the appropriate bit to an interrupt mask.
* In order to accomplish the deferral, icu_lock is required. This
is i386-specific. Thus icu_lock can only be obtained by mainline
i386 code while interrupts are hard disabled. This change has been
made.
* Because interrupts may or may not be hard disabled during a
context switch, cpu_switch() can no longer simply assume that
PSL_I will be in a consistent state. Therefore, it now saves and
restores eflags.
* FAST INTERRUPT PROVISION. Fast interrupts are currently deferred.
The intention is to eventually allow them to operate either while
we are in a critical section or, if we are able to restrict the
use of sched_lock, while we are not holding the sched_lock.
* ICU and APIC vector assembly for I386 cleaned up. The ICU code
has been cleaned up to match the APIC code in regards to format
and macro availability. Additionally, the code has been adjusted
to deal with deferred interrupts.
* Deferred interrupts use a per-cpu boolean int_pending, and
masks ipending, spending, and fpending. Being per-cpu variables
it is not currently necessary to lock; bus cycles modifying them.
Note that the same mechanism will enable preemption to be
incorporated as a true software interrupt without having to
further hack up the critical nesting code.
* Note: the old critical_enter() code in kern/kern_switch.c is
currently #ifdef to be compatible with both the old and new
methodology. In STAGE-2 it will be moved entirely to MD code.
Performance issues:
One of the purposes of this commit is to enhance critical section
performance, specifically to greatly reduce bus overhead to allow
the critical section code to be used to protect per-cpu caches.
These caches, such as Jeff's slab allocator work, can potentially
operate very quickly making the effective savings of the new
critical section code's performance very significant.
The second purpose of this commit is to allow architectures to
enable certain interrupts while in a critical section. Specifically,
the intention is to eventually allow certain FAST interrupts to
operate rather then defer.
The third purpose of this commit is to begin to clean up the
critical_enter()/critical_exit()/cpu_critical_enter()/
cpu_critical_exit() API which currently has serious cross pollution
in MI code (in fork_exit() and ast() for example).
The fourth purpose of this commit is to provide a framework that
allows kernel-preempting software interrupts to be implemented
cleanly. This is currently used for two forward interrupts in I386.
Other architectures will have the choice of using this infrastructure
or building the functionality directly into critical_enter()/
critical_exit().
Finally, this commit is designed to greatly improve the flexibility
of various architectures to manage critical section handling,
software interrupts, preemption, and other highly integrated
architecture-specific details.
on for a while:
- fine grained TLB shootdown for SMP on i386
- ranged TLB shootdowns.. eg: specify a range of pages to shoot down with
a single IPI, since the IPI is very expensive. Adjust some callers
that used to trigger this inside tight loops to do a ranged shootdown
at the end instead.
- PG_G support for SMP on i386 (options ENABLE_PG_G)
- defer PG_G activation till after we decide what we are going to do with
PSE and the 4MB pages at the start of the kernel. This should solve
some rumored strangeness about stale PG_G entries getting stuck
underneath the 4MB pages.
- add some instrumentation for the fine TLB shootdown
- convert some asm instruction wrappers from functions to inlines. gcc
seems to do a fair bit better with this.
- [temporarily!] pessimize the tlb shootdown IPI handlers. I will fix
this again shortly.
This has been working fairly well for me for a while, but I have tweaked
it again prior to commit since my last major testing round. The only
outstanding problem that I know of is PG_G related, which is why there
is an option for it (not on by default for SMP). I have seen a world
speedups by a few percent (as much as 4 or 5% in one case) but I have
*not* accurately measured this - I am a bit sceptical of these numbers.
is not configured. Including <isa/isavar.h> when it is not used is
harmful as well as bogus, since it includes "isa_if.h" which is not
generated when isa is not configured.
This was fixed in 1999 but was broken by unconditionalizing PNPBIOS.
mutex releases to not require flags for the cases when preemption is
not allowed:
The purpose of the MTX_NOSWITCH and SWI_NOSWITCH flags is to prevent
switching to a higher priority thread on mutex releease and swi schedule,
respectively when that switch is not safe. Now that the critical section
API maintains a per-thread nesting count, the kernel can easily check
whether or not it should switch without relying on flags from the
programmer. This fixes a few bugs in that all current callers of
swi_sched() used SWI_NOSWITCH, when in fact, only the ones called from
fast interrupt handlers and the swi_sched of softclock needed this flag.
Note that to ensure that swi_sched()'s in clock and fast interrupt
handlers do not switch, these handlers have to be explicitly wrapped
in critical_enter/exit pairs. Presently, just wrapping the handlers is
sufficient, but in the future with the fully preemptive kernel, the
interrupt must be EOI'd before critical_exit() is called. (critical_exit()
can switch due to a deferred preemption in a fully preemptive kernel.)
I've tested the changes to the interrupt code on i386 and alpha. I have
not tested ia64, but the interrupt code is almost identical to the alpha
code, so I expect it will work fine. PowerPC and ARM do not yet have
interrupt code in the tree so they shouldn't be broken. Sparc64 is
broken, but that's been ok'd by jake and tmm who will be fixing the
interrupt code for sparc64 shortly.
Reviewed by: peter
Tested on: i386, alpha
and it's associated state variables: icu_lock with the name "icu". This
renames the imen_mtx for x86 SMP, but also uses the lock to protect
access to the 8259 PIC on x86 UP. This also adds an appropriate lock to
the various Alpha chipsets which fixes problems with Alpha SMP machines
dropping interrupts with an SMP kernel.
- The MD functions critical_enter/exit are renamed to start with a cpu_
prefix.
- MI wrapper functions critical_enter/exit maintain a per-thread nesting
count and a per-thread critical section saved state set when entering
a critical section while at nesting level 0 and restored when exiting
to nesting level 0. This moves the saved state out of spin mutexes so
that interlocking spin mutexes works properly.
- Most low-level MD code that used critical_enter/exit now use
cpu_critical_enter/exit. MI code such as device drivers and spin
mutexes use the MI wrappers. Note that since the MI wrappers store
the state in the current thread, they do not have any return values or
arguments.
- mtx_intr_enable() is replaced with a constant CRITICAL_FORK which is
assigned to curthread->td_savecrit during fork_exit().
Tested on: i386, alpha
- The MI portions of struct globaldata have been consolidated into a MI
struct pcpu. The MD per-CPU data are specified via a macro defined in
machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the
interface would be cleaner (PCPU_GET(my_md_field) vs.
PCPU_GET(md.md_my_md_field)).
- All references to globaldata are changed to pcpu instead. In a UP kernel,
this data was stored as global variables which is where the original name
came from. In an SMP world this data is per-CPU and ideally private to each
CPU outside of the context of debuggers. This also included combining
machine/globaldata.h and machine/globals.h into machine/pcpu.h.
- The pointer to the thread using the FPU on i386 was renamed from
npxthread to fpcurthread to be identical with other architectures.
- Make the show pcpu ddb command MI with a MD callout to display MD
fields.
- The globaldata_register() function was renamed to pcpu_init() and now
init's MI fields of a struct pcpu in addition to registering it with
the internal array and list.
- A pcpu_destroy() function was added to remove a struct pcpu from the
internal array and list.
Tested on: alpha, i386
Reviewed by: peter, jake
- Now that apm loadable module can inform its existence to other kernel
components (e.g. i386/isa/clock.c:startrtclock()'s TCS hack).
- Exchange priority of SI_SUB_CPU and SI_SUB_KLD for above purpose.
- Add simple arbitration mechanism for APM vs. ACPI. This prevents
the kernel enables both of them.
- Remove obsolete `#ifdef DEV_APM' related code.
- Add abstracted interface for Powermanagement operations. Public apm(4)
functions, such as apm_suspend(), should be replaced new interfaces.
Currently only power_pm_suspend (successor of apm_suspend) is implemented.
Reviewed by: peter, arch@ and audit@
- sys/pc98/pc98/npx.c 1.87 (2001/09/15; author: imp)
I don't think pc98 has acpi at all, so ifdef the acpi attachments for
now.
This completes merging sys/pc98/pc98/npx.c into sys/i386/isa/npx.c so
that the former can be removed.
and the irq are different for pc98, and are not very well handled (we
use a historical mess of hard-coded values, values from header files
and values from hints).
- 1.58 (2000/09/01; author: kato)
Fixed FPU_ERROR_BROKEN code. It had old-isa code.
- 1.33 (1998/03/09; author: kato)
Make FPU_ERROR_BROKEN a new-style option.
- 1.7 (1996/10/09; author: asami)
Make sure FPU is recognized for non-Intel CPUs.
The log for rev.1.7 should have said something like:
Added FPU_ERROR_BROKEN option. This forces a successful probe for
exception 16, so that hardware with a broken FPU error signal can sort
of work.
Use the normal interrupt handler (npx_intr()) instead of a special
probe-time interrupt handler, although this causes problems due to
the bus_teardown_intr() not actually even tearing down the interrupt
(these problems were avoided by doing interrupt attachment for the
special interrupt handler directly). Fixed minor bitrot in comments.
The reason for the npxprobe()/npxprobe1() split mostly went away at
about the same time it was made (in 1992 or 1993 just before the
beginning of history). 386BSD ran all probes with interrupts completely
masked, and I didn't want to disturb this when I added an irq probe
to npxprobe(). An irq (not necessarily npx) must be acked for at least
external npx's to take the cpu out of the wait state that it enters
when an npx error occurs, so the probe must be done with a suitable
irq unmasked. npxprobe() went to great lengths to unmask precisely
the npx irq.
Running probes with all interrupts masked was never really needed in
FreeBSD, since FreeBSD always masked interrupts well enough using
splhigh(), but it wasn't until rev.1.48 (1995/12/12) of autoconf.c
that all probes were run with CPU interrupts enabled. This permits
npxprobe() to probe its irq using normal interrupt resources. Note
that most drivers still can't depend on this. It depends on the
interrupt handler being fast and the irq not being shared.
lost when the buggy code goes away completely:
- don't assume that the npx irq number is >= 8. Rev.1.73 only reversed
part of the hard-coding of it to 13 in rev.1.66.
- backed out the part of rev.1.84 that added a highly confused comment
about an enable_intr() being "highly bogus". The whole reason for
existence of npxprobe() (separate from the main probe, npxprobe1())
is to handle the complications to make this enable_intr() safe.
- backed out the part of rev.1.94 that modified npxprobe(). It mainly
broke the enable_intr() to restore_intr(). Restoring the interrupt
state in a nested way is precisely what is not wanted here. It was
harmless in practice because npxprobe() is called with interrupts
enabled, so restoring the interrupt state enables interrupts. Most
of npxprobe() is a no-op for the same reason...
code in ipl.s and icu_ipl.s that used them was removed when the
interrupt thread system was committed. Debuggers also knew about
Xresume* because these labels hide the real names of the interrupt
handlers (Xintr*), and debuggers need to special-case interrupt
handlers to get the interrupt frame.
Both gdb and ddb will now use the Xintr* and Xfastintr* symbols to
detect interrupt frames. Fast interrupt frames were never identified
correctly before, so this fixes the problem of the running stack
frame getting lost in a ddb or gdb trace generated from a fast
interrupt - e.g. when debugging a simple infinite loop in the kernel
using a serial console, the frame containing the loop would never
appear in a gdb or ddb trace.
Reviewed by: jhb, bde
The type of bus_space_tag_t is now a pointer to bus_space_tag structure,
and the bus_space_tag structure saves pointers to functions for direct
access and relocate access.
Added bsh_bam member to the bus_space_handle structure, it saves access
method either direct access or relocate access which is called by
bus_space_* functions.
Added the mecia device support. If the bs_da and bs_ra in bus tag are set
NEPC_io_space_tag and NEPC_mem_space_tag respectively, new bus_space stuff
changes the register of mecia automatically for 16bit access.
Obtained from: NetBSD/pc98
the current interrupt thread routines will guarantee the condition this is
checking for at a higher level but inthand_add() and inthand_remove() as
they currently exist don't satisfy this condition. (Which does need to be
fixed but which will take a bit more work.) This fixes shared interrupts.
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
with system statistics monitoring tools (such as systat, vmstat...)
because of stopping RTC interrupts generation.
Restore all the timers (RTC and i8254) atomically.
Reviewed by: bde
MFC after: 1 week
the process of exiting the kernel. The ast() function now loops as long
as the PS_ASTPENDING or PS_NEEDRESCHED flags are set. It returns with
preemption disabled so that any further AST's that arrive via an
interrupt will be delayed until the low-level MD code returns to user
mode.
- Use u_int's to store the tick counts for profiling purposes so that we
do not need sched_lock just to read p_sticks. This also closes a
problem where the call to addupc_task() could screw up the arithmetic
due to non-atomic reads of p_sticks.
- Axe need_proftick(), aston(), astoff(), astpending(), need_resched(),
clear_resched(), and resched_wanted() in favor of direct bit operations
on p_sflag.
- Fix up locking with sched_lock some. In addupc_intr(), use sched_lock
to ensure pr_addr and pr_ticks are updated atomically with setting
PS_OWEUPC. In ast() we clear pr_ticks atomically with clearing
PS_OWEUPC. We also do not grab the lock just to test a flag.
- Simplify the handling of Giant in ast() slightly.
Reviewed by: bde (mostly)
we are required to do if we let user processes use the extra 128 bit
registers etc.
This is the base part of the diff I got from:
http://www.issei.org/issei/FreeBSD/sse.html
I believe this is by: Mr. SUZUKI Issei <issei@issei.org>
SMP support apparently by: Takekazu KATO <kato@chino.it.okayama-u.ac.jp>
Test code by: NAKAMURA Kazushi <kaz@kobe1995.net>, see
http://kobe1995.net/~kaz/FreeBSD/SSE.en.html
I have fixed a couple of style(9) deviations. I have some followup
commits to fix a couple of non-style things.
(this commit is just the first stage). Also add various GIANT_ macros to
formalize the removal of Giant, making it easy to test in a more piecemeal
fashion. These macros will allow us to test fine-grained locks to a degree
before removing Giant, and also after, and to remove Giant in a piecemeal
fashion via sysctl's on those subsystems which the authors believe can
operate without Giant.
- move the sysctl code to kern_intr.c
- do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine
the length of the intrcnt array
- move the declarations of intrnames, eintrnames, intrcnt and eintrcnt
from machine-dependent include files to sys/interrupt.h
- remove the hw.nintr sysctl, it is not needed.
- fix various style bugs
Requested by: bde
Reviewed by: bde (some time ago)
simpler for npx exceptions that start as traps (no assembly required...)
and works better for npx exceptions that start as interrupts (there is
no longer a problem for nested interrupts).
Submitted by: original (pre-SMPng) version by luoqi
npxsave() went to great lengths to excecute fnsave with interrupts
enabled in case executing it froze the CPU. This case can't happen,
at least for Intel CPU/NPX's. Spurious IRQ13's don't imply spurious
freezes. Anyway, the complications were usually no-ops because IRQ13
is not used on i486's and newer CPUs, and because SMPng broke them in
rev.1.84. Forcible enabling of interrupts was changed to
write_eflags(old_eflags), but since SMPng usually calls npxsave() from
cpu_switch() with interrupts disabled, write_eflags() usually just
kept interrupts disabled.
npxinit() didn't have the usual race because it doesn't save to curpcb,
but it may have had a worse form of it since it uses the npx when it
doesn't "own" it. I'm not sure if locking prevented this. npxinit()
is normally caled with the proc lock but not sched_lock.
Use a critical region to protect pushing of curproc's npx state to
curpcb in npxexit(). Not doing so was harmless since it at worst
saved a wrong state to a dieing pcb.
handling, SMPng always switches the npx context away from curproc
before calling the handler, so the handler always paniced. When using
exception 16 exception handling, SMPng sometimes switches the npx
context away from curproc before calling the handler, so the handler
sometimes paniced. Also, we didn't lock the context while using it,
so we sometimes didn't detect the switch and then paniced in a less
controlled way.
Just lock the context while using it, and return without doing anything
except clearing the busy latch if the context is not for curproc. This
fixes the exception 16 case and makes the IRQ13 case harmless. In both
cases, the instruction that caused the exception is restarted and the
exception repeats. In the exception 16 case, we soon get an exception
that can be handled without doing anything special. In the IRQ13 case,
we get an easy to kill hung process.
other "system" header files.
Also help the deprecation of lockmgr.h by making it a sub-include of
sys/lock.h and removing sys/lockmgr.h form kernel .c files.
Sort sys/*.h includes where possible in affected files.
OK'ed by: bde (with reservations)
been made machine independent and various other adjustments have been made
to support Alpha SMP.
- It splits the per-process portions of hardclock() and statclock() off
into hardclock_process() and statclock_process() respectively. hardclock()
and statclock() call the *_process() functions for the current process so
that UP systems will run as before. For SMP systems, it is simply necessary
to ensure that all other processors execute the *_process() functions when the
main clock functions are triggered on one CPU by an interrupt. For the alpha
4100, clock interrupts are delievered in a staggered broadcast fashion, so
we simply call hardclock/statclock on the boot CPU and call the *_process()
functions on the secondaries. For x86, we call statclock and hardclock as
usual and then call forward_hardclock/statclock in the MD code to send an IPI
to cause the AP's to execute forwared_hardclock/statclock which then call the
*_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
involve less hackery. Now the cpu doing the forward sets any flags, etc. and
sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically
return so that they can execute ast() and don't bother with setting the
astpending or needresched flags themselves. This also removes the loop in
forward_signal() as sched_lock closes the race condition that the loop worked
around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
a process to act on rather than assuming curproc so that they can be used to
implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
header sys/smp.h declares MI SMP variables and API's. The IPI API's from
machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
SLIST of globaldata structures has become MI and moved into subr_smp.c.
Also, the globaldata list is only available if SMP support is compiled in.
Reviewed by: jake, peter
Looked over by: eivind
of long and int64_t; and print the result as an unsigned long. This should
make the output from the bzero() test more readable, and avoid printing a
negative bandwidth. Note that this doesn't change the decision process,
since that is based on time elapsed, not on computed bandwidth.
and used in C or vice versa. The elf compiler uses the same names
for both. Remove asnames.h with great prejudice; it has served its
purpose.
Note that this does not affect the ability to generate an aout kernel
due to gcc's -mno-underscores option.
moral support from: peter, jhb
scheduling an interrupt thread to run when needed. This has the side
effect of enabling support for entropy gathering from interrupts on
all architectures.
- Change the software interrupt and x86 and alpha hardware interrupt code
to use ithread_schedule() for most of their processing when scheduling
an interrupt to run.
- Remove the pesky Warning message about interrupt threads having entropy
enabled. I'm not sure why I put that in there in the first place.
- Add more error checking for parameters and change some cases that
returned EINVAL to panic on failure instead via KASSERT().
- Instead of doing a documented evil hack of setting the P_NOLOAD flag
on every interrupt thread whose pri was SWI_CLOCK, set the flag
explicity for clk_ithd's proc during start_softintr().
tsc_present in the right places (together with other variables of the
same linkage), and don't use messy ifdefs just to avoid exporting it in
some cases.
Some things needed bits of <i386/include/lock.h> - cy.c now has its
own (only) copy of the COM_(UN)LOCK() macros, and IMASK_(UN)LOCK()
has been moved to <i386/include/apic.h> (AKA <machine/apic.h>).
Reviewed by: jhb
mtx_enter(lock, type) becomes:
mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)
similarily, for releasing a lock, we now have:
mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.
The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.
Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:
MTX_QUIET and MTX_NOSWITCH
The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:
mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.
Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.
Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.
Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.
Finally, caught up to the interface changes in all sys code.
Contributors: jake, jhb, jasone (in no particular order)
- If possible, context switch to the thread directly in sched_ithd(),
rather than triggering a delayed ast reschedule.
- Disable interrupts while restoring fpu state in the trap handler,
in order to ensure that we are not preempted in the middle, which
could cause migration to another cpu.
Reviewed by: peter
Tested by: peter (alpha)
interrupt threads to run with it always >= 1, so that malloc can
detect M_WAITOK from "interrupt" context. This is also necessary
in order to context switch from sched_ithd() directly.
Reviewed By: peter
initialization until after malloc() is safe to call, then iterate through
all mutexes and complete their initialization.
This change is necessary in order to avoid some circular bootstrapping
dependencies.
time I tinkered around here. Since INTREN is called from the interrupt
critical path now, it should not be too expensive. In this case, we
look at the bits being changed to decide which 8 bit IO port to write to
rather than unconditionally writing to both. I could probably have gone
further and only done the write if the bits actually changed, but that
seemed overkill for the usual case in interrupt threads.
[an outb is rather expensive when it has to cross the ISA bus]
as multi-processor kernels. The old way made it difficult for kernel
modules to be portable between uni-processor and multi-processor
kernels. It is no longer necessary to jump through hoops.
- always load %fs with the private segment on entry to the kernel
- change the type of the self referntial pointer from struct privatespace
to struct globaldata
- make the globaldata symbol have value 0 in all cases, so the symbols
in globals.s are always offsets, not aliases for fields in globaldata
- define the globaldata space used for uniprocessor kernels in C, rather
than assembler
- change the assmebly language accessors to use %fs, add a macro
PCPU_ADDR(member, reg), which loads the register reg with the address
of the per-cpu variable member
variables from i386 assembly language. The syntax is PCPU(member)
where member is the capitalized name of the per-cpu variable, without
the gd_ prefix. Example: movl %eax,PCPU(CURPROC). The capitalization
is due to using the offsets generated by genassym rather than the symbols
provided by linking with globals.o. asmacros.h is the wrong place for
this but it seemed as good a place as any for now. The old implementation
in asnames.h has not been removed because it is still used to de-mangle
the symbols used by the C variables for the UP case.
Also, while here, run up to 32 interrupt sources on APIC systems.
Normalize INTREN/INTRDIS so they are the same on both UP and SMP systems
rather than sometimes a macro, and sometimes a function.
Reviewed by: jhb, jakeb
because it only takes a struct tag which makes it impossible to
use unions, typedefs etc.
Define __offsetof() in <machine/ansi.h>
Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>
Remove myriad of local offsetof() definitions.
Remove includes of <stddef.h> in kernel code.
NB: Kernelcode should *never* include from /usr/include !
Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.
Deprecate <struct.h> with a warning. The warning turns into an error on
01-12-2000 and the file gets removed entirely on 01-01-2001.
Paritials reviews by: various.
Significant brucifications by: bde
type of software interrupt. Roughly, what used to be a bit in spending
now maps to a swi thread. Each thread can have multiple handlers, just
like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
software interrupt thread to run, so spending, NSWI, and the shandlers
array are no longer needed. We can now have an arbitrary number of
software interrupt threads. When you register a software interrupt
thread via sinthand_add(), you get back a struct intrhand that you pass
to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
more intuitive. Also, prefix all the members of struct intrhand with
'ih_'.
- Make swi_net() a MI function since there is now no point in it being
MD.
Submitted by: cp
return through doreti to handle ast's. This is necessary for the
clock interrupts to work properly.
- Change the clock interrupts on the x86 to be fast instead of threaded.
This is needed because both hardclock() and statclock() need to run in
the context of the current process, not in a separate thread context.
- Kill the prevproc hack as it is no longer needed.
- We really need Giant when we call psignal(), but we don't want to block
during the clock interrupt. Instead, use two p_flag's in the proc struct
to mark the current process as having a pending SIGVTALRM or a SIGPROF
and let them be delivered during ast() when hardclock() has finished
running.
- Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It was
broken on the x86 if it was turned on since cpl is gone. It's only use
was to bogusly run softclock() directly during hardclock() rather than
scheduling an SWI.
- Remove the COM_LOCK simplelock and replace it with a clock_lock spin
mutex. Since the spin mutex already handles disabling/restoring
interrupts appropriately, this also lets us axe all the *_intr() fu.
- Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use
temporary fast interrupts for the APIC trial.
- Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending
signals in hardclock() that are to be delivered in ast().
Submitted by: jakeb (making statclock safe in a fast interrupt)
Submitted by: cp (concept of delaying signals until ast())
- Make softinterrupts (SWI's) almost completely MI, and divorce them
completely from the x86 hardware interrupt code.
- The ihandlers array is now gone. Instead, there is a MI shandlers array
that just contains SWI handlers.
- Most of the former machine/ipl.h files have moved to a new sys/ipl.h.
- Stub out all the spl*() functions on all architectures.
Submitted by: dfr
thread for each interrupt that comes in. If we don't, log the event and
return immediately for a hardware interrupt. For a softinterrupt, panic
instead.
Submitted by: ben
newbus for referencing device interrupt handlers.
- Move the 'struct intrec' type which describes interrupt sources into
sys/interrupt.h instead of making it just be a x86 structure.
- Don't create 'ithd' and 'intrec' typedefs, instead, just use 'struct ithd'
and 'struct intrec'
- Move the code to translate new-bus interrupt flags into an interrupt thread
priority out of the x86 nexus code and into a MI ithread_priority()
function in sys/kern/kern_intr.c.
- Remove now-uneeded x86-specific headers from sys/dev/ata/ata-all.c and
sys/pci/pci_compat.c.
don't take an arg, but swi_generic() is special in order to avoid one
whole conditional branch in the old SWI dispatch code. The new SWI
dispatch code passed it a garbage arg. Bypass swi_generic() and call
swi_dispatcher() directly, like the corresponding alpha code has always
done.
The panic was rare because because it only occurred if more than one
of the {sio,cy,rc} drivers was configured and one was active, and the
cy driver doesn't even compile.
include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be
preempted (i386 only).
Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
panicing and return a status so that we can decide whether to drop
into DDB or panic. If the status from isa_nmi is true, panic the
kernel based on machdep.panic_on_nmi, otherwise if DDB is
enabled, drop to DDB based on machdep.ddb_on_nmi.
Reviewed by: peter, phk
the PnP probe is merely a stub as we make assumptions about some of this
hardware before we have probed it.
Since these devices (with the exception of the speaker) are 'standard',
suppress output in the !bootverbose case to clean up the probe messages
somewhat.
and does not require that evil list of drivers in isa_compat.h.
It uses the same strategy that pci drivers use, namely a
COMPAT_ISA_DRIVER() macro that creates the glue on the fly.
Theoretically old-style isa drivers should be preloadable now.
operands. `movw' could be used, but instead let the assembler decide
the right instruction to use.
2. AT&T asm syntax requires a leading '*' in front of the operand for
indirect calls and jumps.
includes one of bus_at386.h and bus_pc98.h. Becuase only bus_pc98.h
supports indirect pio and bus_at386.h is identical to old bus.h, there
is no functional change in PC-AT's kernels. That is, it cannot cause
performance loss.
Submitted by: nyan
Reviewed by: imp
bde and luoqi provided useful comments for earlier version.
Make the public interface more systematically named.
Remove the alternate method, it doesn't do any good, only ruins performance.
Add counters to profile the usage of the 8 access functions.
Apply the beer-ware to my code.
The weird +/- counts are caused by two repocopies behind the scenes:
kern/kern_clock.c -> kern/kern_tc.c
sys/time.h -> sys/timetc.h
(thanks peter!)
it's options COMPAT_OLDISA and COMPAT_OLDPCI. This is meant to be a
fairly strong incentive to update the older drivers to newbus, but doesn't
(quite) leave anybody hanging with no hardware support. I was talking with
a few folks and I was encouraged to simply break or disable the shims but
that was a bit too drastic for my liking.
the low level interrupt handler number should be used. Change
setup_apic_irq_mapping() to allocate low level interrupt handler X (Xintr${X})
for any ISA interrupt X mentioned in the MP table.
Remove an assumption in the driver for the system clock (clock.c) that
interrupts mentioned in the MP table as delivered to IOAPIC #0 intpin Y
is handled by low level interrupt handler Y (Xintr${Y}) but don't assume
that low level interrupt handler 0 (Xintr0) is used.
Don't allocate two low level interrupt handlers for the system clock.
Reviewed by: NOKUBI Hirotaka <hnokubi@yyy.or.jp>
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.
is not configured. Including <isa/isavar.h> when it is not used is
harmful as well as bogus, since it includes "isa_if.h" which is not
generated when isa is not configured.
Some interface botches went away, leaving the macros unused outside of
the implementation of interrupt masking, and it was silly for the
implementation to use the macros in only one place each.
apm_default_resume() to sometimes set a very wrong time.
(1) Accesses to the RTC index and data registers were not atomic enough.
Interrupts were not masked. This was only good enough until an
interrupt handler (rtcintr()) started accessing the RTC in FreeBSD-2.0.
(2) Access to the block of time registers in inittodr() was not atomic
enough. inittodr() has 244us to read the time registers. Interrupts
were not masked. This was only good enough until something (apm)
started calling inittodr() after boot time in FreeBSD-2.0.
The fix for (2) also makes the timecounter update more atomic, although
this is currently unimportant due to the low resolution of the RTC.
Problem reported by: mckay
the countdown register.
this should not be necessary but there are broken laptops that
do not restore the countdown register on resume.
when it happnes, it messes up the hardclock interval and system clock,
which leads to the infamous "calcru: negative time" problem.
Submitted by: kjc, iwasaki
Reviewed by: Steve O'Hara-Smith <steveo@eircom.net> and committers.
Obtained from: PAO3
resource_list_release. This removes the dependancy on the
layout of ivars.
* Move set_resource, get_resource and delete_resource from
isa_if.m to bus_if.m.
* Simplify driver code by providing wrappers to those methods:
bus_set_resource(dev, type, rid, start, count);
bus_get_resource(dev, type, rid, startp, countp);
bus_get_resource_start(dev, type, rid);
bus_get_resource_count(dev, type, rid);
bus_delete_resource(dev, type, rid);
* Delete isa_get_rsrc and use bus_get_resource_start instead.
* Fix a stupid typo in isa_alloc_resource reported by Takahashi
Yoshihiro <nyan@FreeBSD.org>.
* Print a diagnostic message if we can't assign resources to a PnP
device.
* Change device_print_prettyname() so that it doesn't print
"(no driver assigned)-1" for anonymous devices.
active or not. The only sane thing we can do here is assume that if
APM is supported it might be active at some point, and bail.
In reality, even this isn't good enough; regardless of whether we support
APM or not, the system may well futz with the CPU's clock speed and throw
the TSC off. We need to stop using it for timekeeping except under
controlled circumstances. Curse the lack of a dependable high-resolution
timer.
macros) to the signal handler, for old-style BSD signal handlers as
the second (int) argument, for SA_SIGINFO signal handlers as
siginfo_t->si_code. This is source-compatible with Solaris, except
that we have no <siginfo.h> (which isn't even mentioned in POSIX
1003.1b).
An rather complete example program is at
http://www3.cons.org/cracauer/freebsd-signal.c
This will be added to the regression tests in src/.
This commit also adds code to disable the (hardware) FPU from
userconfig, so that you can use a software FP emulator on a machine
that has hardware floating point. See LINT.
the caller to specify a function to be guarded between an entry and exit
barrier, as well as pre- and post-barrier functions.
The primary use for this function is synchronised update of per-cpu private
data. The implementation is almost (but not quite) MI; with a better
mechanism for masking per-CPU interrupts it could probably be hoisted.
Reviewed by: peter (partially)
but broken, since tsc_timecounter is not initialised in that case,
and updating an uninitialised timecounter is fatal.
Fixed style bugs in the machdep.i8254_freq and machdep.tsc_freq
sysctls.
Reviewed by: phk
not masked during handling of shared PCI interrupts. This resulted in
ASTs sometimes being discarded and softclock interrupts sometimes being
handled prematurely (sometimes = quite often on systems with shared PCI
interrupts, never on other systems).
Debugged by: gibbs and other people at plutotech.com
PR: 6944, maybe 12381
1. Rise is recognized in identdcpu.c.
2. The TSC is not written to. A workaround for the CPU bug is being
applied to clock.c (the bug being that the mP6 has TSC enabled
in its CPUID-capabilities, but it only supports reading it. If we
try to write to it (MSR 16), a GPF occurs.) The new behavior is that
FreeBSD will _not_ zero the TSC. Instead, we do a bit of 64-bit
arithmetic.
Reviewed by: msmith
Obtained from: unfurl & msmith
The old version only worked right when the time was read strictly
more often than every 1/HZ seconds, but we only guarantee reading
it every (1/HZ + epsilon) seconds. Part of rev.1.126-1.127 attempted
to fix this but didn't succeed. Detect counter rollover using the
heuristic from the old version of microtime() with additional
complications for supporting calls from fast interrupt handlers.
This works provided i8254 interrupts are not delayed by more than
1/(2*HZ) seconds.
This needs more comments, and cleanups for the SMP case, and more
testing of the SMP case before it is merged into RELENG_3.
Tested by: jhay
* Re-work the resource allocation code to use helper functions in subr_bus.c.
* Add simple isa interface for manipulating the resource ranges which can be
allocated and remove the code from isa_write_ivar() which was previously
used for this purpose.
though, on systems (386 mostly) that still have a seperate fpu, but it
might be possible to find systems where the FPU coprocessor is wired to
a different IRQ pin.
instances to a parent bus.
* Define a new method BUS_ADD_CHILD which can be called from DEVICE_IDENTIFY
to add new instances.
* Add a generic implementation of DEVICE_PROBE which calls DEVICE_IDENTIFY
for each driver attached to the parent's devclass.
* Move the hint-based isa probe from the isa driver to a new isahint driver
which can be shared between i386 and alpha.
new isa drivers with sensitive flags. If the resource_find() code
is meant to "find" the wildcard sensitive flag for a driver even though
a unit is supplied, this can be simplified.
for elf kernels (it is broken for all kernels due to lack of egcs support).
Renaming of many assembler labels is avoided by declaring by declaring
the labels that need to be visible to gprof as having type "function"
and depending on the elf version of gprof being zealous about discarding
the others. A few type declarations are still missing, mainly for SMP.
PR: 9413
Submitted by: Assar Westerlund <assar@sics.se> (initial parts)
handler. This fixes pnp interrupts and would have fixed pccard interrupts
but a workaround has been applied there.
This the sound driver problems which people have reported with new-bus.
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
is added to point to per-cpu pages, per-cpu global variables are now
accessed through this new selector (%fs). The selectors in gdt table are
rearranged for cache line optimization.
- fask_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.
Reviewed by: Alan Cox <alc@cs.rice.edu>
John Dyson <dyson@iquest.net>
Julian Elischer <julian@whistel.com>
Bruce Evans <bde@zeta.org.au>
David Greenman <dg@root.com>
conversion from short to unsigned long which is an argument of
bus_alloc_resource. Since the value -1 is used to indicate no port
reousece, id_port need to be signed (suggested by Doug Rabson and
Peter Wemm.)
Interrupts under the new scheme are managed by the i386 nexus with the
awareness of the resource manager. There is further room for optimizing
the interfaces still. All the users of register_intr()/intr_create()
should be gone, with the exception of pcic and i386/isa/clock.c.
i386 platform boots, it is no longer ISA-centric, and is fully dynamic.
Most old drivers compile and run without modification via 'compatability
shims' to enable a smoother transition. eisa, isapnp and pccard* are
not yet using the new resource manager. Once fully converted, all drivers
will be loadable, including PCI and ISA.
(Some other changes appear to have snuck in, including a port of Soren's
ATA driver to the Alpha. Soren, back this out if you need to.)
This is a checkpoint of work-in-progress, but is quite functional.
The bulk of the work was done over the last few years by Doug Rabson and
Garrett Wollman.
Approved by: core