freebsd-skq

Author	SHA1	Message	Date
John Baldwin	4ac60df584	Add a new 'pmap_invalidate_cache()' to flush the CPU caches via the wbinvd() instruction. This includes a new IPI so that all CPU caches on all CPUs are flushed for the SMP case. MFC after: 1 month	2006-05-01 21:36:47 +00:00
John Baldwin	215e7c161a	Rework how we wire up interrupt sources to CPUs: - Throw out all of the logical APIC ID stuff. The Intel docs are somewhat ambiguous, but it seems that the "flat" cluster model we are currently using is only supported on Pentium and P6 family CPUs. The other "hierarchy" cluster model that is supported on all Intel CPUs with local APICs is severely underdocumented. For example, it's not clear if the OS needs to glean the topology of the APIC hierarchy from somewhere (neither ACPI nor MP Table include it) and setup the logical clusters based on the physical hierarchy or not. Not only that, but on certain Intel chipsets, even though there were 4 CPUs in a logical cluster, all the interrupts were only sent to one CPU anyway. - We now bind interrupts to individual CPUs using physical addressing via the local APIC IDs. This code has also moved out of the ioapic PIC driver and into the common interrupt source code so that it can be shared with MSI interrupt sources since MSI is addressed to APICs the same way that I/O APIC pins are. - Interrupt source classes grow a new method pic_assign_cpu() to bind an interrupt source to a specific local APIC ID. - The SMP code now tells the interrupt code which CPUs are avaiable to handle interrupts in a simpler and more intuitive manner. For one thing, it means we could now choose to not route interrupts to HT cores if we wanted to (this code is currently in place in fact, but under an #if 0 for now). - For now we simply do static round-robin of IRQs to CPUs when the first interrupt handler just as before, with the change that IRQs are now bound to individual CPUs rather than groups of up to 4 CPUs. - Because the IRQ to CPU mapping has now been moved up a layer, it would be easier to manage this mapping from higher levels. For example, we could allow drivers to specify a CPU affinity map for their interrupts, or we could allow a userland tool to bind IRQs to specific CPUs. The MFC is tentative, but I want to see if this fixes problems some folks had with UP APIC kernels on 6.0 on SMP machines (an SMP kernel would work fine, but a UP APIC kernel (such as GENERIC in RELENG_6) would lose interrupts). MFC after: 1 week	2006-02-28 22:24:55 +00:00
John Baldwin	b439e431bf	Tweak how the MD code calls the fooclock() methods some. Instead of passing a pointer to an opaque clockframe structure and requiring the MD code to supply CLKF_FOO() macros to extract needed values out of the opaque structure, just pass the needed values directly. In practice this means passing the pair (usermode, pc) to hardclock() and profclock() and passing the boolean (usermode) to hardclock_cpu() and hardclock_process(). Other details: - Axe clockframe and CLKF_FOO() macros on all architectures. Basically, all the archs were taking a trapframe and converting it into a clockframe one way or another. Now they can just extract the PC and usermode values directly out of the trapframe and pass it to fooclock(). - Renamed hardclock_process() to hardclock_cpu() as the latter is more accurate. - On Alpha, we now run profclock() at hz (profhz == hz) rather than at the slower stathz. - On Alpha, for the TurboLaser machines that don't have an 8254 timecounter, call hardclock() directly. This removes an extra conditional check from every clock interrupt on Alpha on the BSP. There is probably room for even further pruning here by changing Alpha to use the simplified timecounter we use on x86 with the lapic timer since we don't get interrupts from the 8254 on Alpha anyway. - On x86, clkintr() shouldn't ever be called now unless using_lapic_timer is false, so add a KASSERT() to that affect and remove a condition to slightly optimize the non-lapic case. - Change prototypeof arm_handler_execute() so that it's first arg is a trapframe pointer rather than a void pointer for clarity. - Use KCOUNT macro in profclock() to lookup the kernel profiling bucket. Tested on: alpha, amd64, arm, i386, ia64, sparc64 Reviewed by: bde (mostly)	2005-12-22 22:16:09 +00:00
John Baldwin	2dce95a085	Change the i386 code to pass the interrupt vector as a separate argument rather than embedding it in the intrframe as if_vec. This reduces diffs with amd64 somewhat. - Remove cf_vec from clockframe (it wasn't used anyway) and stop pushing dummy vector arguments for ipi_bitmap_handler() and lapic_handle_timer() since clockframe == trapframe now. - Fix ddb to handle stack traces across interrupt entry points that just have a trapframe on their stack and not a trapframe + vector. - Change intr_execute_handlers() to take a trapframe rather than an intrframe pointer. - Change lapic_handle_intr() and atpic_handle_intr() to take a vector and trapframe rather than an intrframe. - GC struct intrframe now that nothing uses it anymore. - GC CLOCK_TO_TRAPFRAME() and INTR_TO_TRAPFRAME(). Reviewed by: bde Requested by: peter	2005-12-05 22:39:09 +00:00
John Baldwin	c7362ff7fb	Change the x86 code to allocate IDT vectors on-demand when an interrupt source is first enabled similar to how intr_event's now allocate ithreads on-demand. Previously, we would map IDT vectors 1:1 to IRQs. Since we only have 191 available IDT vectors for I/O interrupts, this limited us to only supporting IRQs 0-190 corresponding to the first 190 I/O APIC intpins. On many machines, however, each PCI-X bus has its own APIC even though it only has 1 or 2 devices, thus, we were reserving between 24 and 32 IRQs just for 1 or 2 devices and thus 24 or 32 IDT vectors. With this change, a machine with 100 IRQs but only 5 in use will only use up 5 IDT vectors. Also, this change provides an API (apic_alloc_vector() and apic_free_vector()) that will allow a future MSI interrupt source driver to request IDT vectors for use by MSI interrupts on x86 machines. Tested on: amd64, i386	2005-11-02 20:11:47 +00:00
Stephan Uphoff	6097174e4d	Add IPI support for preempting a thread on another CPU. MFC after: 3 weeks	2005-06-09 18:23:54 +00:00
John Baldwin	2326e092a7	Remove support for mixed mode altogether now that we no longer use IRQ 0 when using an APIC. This simplifies the APIC code somewhat and also allows us to be pedantically more compliant with ACPI which mandates no use of mixed mode.	2005-04-14 17:59:58 +00:00
John Baldwin	e8ce55117b	Use the local APIC timer to drive the various kernel clocks on SMP machines rather than forwarding interrupts from the clock devices around using IPIs: - Add an IDT vector that pushes a clock frame and calls lapic_handle_timer(). - Add functions to program the local APIC timer including setting the divisor, and setting up the timer to either down a periodic countdown or one-shot countdown. - Add a lapic_setup_clock() function that the BSP calls from cpu_init_clocks() to setup the local APIC timer if it is going to be used. The setup uses a one-shot countdown to calibrate the timer. We then program the timer on each CPU to fire at a frequency of hz * 3. stathz is defined as freq / 23 (hz * 3 / 23), and profhz is defined as freq / 2 (hz * 3 / 2). This gives the clocks relatively prime divisors while keeping a low LCM for the frequency of the clock interrupts. Thanks to Peter Jeremy for suggesting this approach. - Remove the hardclock and statclock forwarding code including the two associated IPIs. The bitmap IPI handler has now effectively degenerated to just IPI_AST. - When the local APIC timer is used we don't turn the RTC on at all, but we still enable interrupts on the ISA timer 0 (i8254) for timecounting purposes.	2005-02-08 20:25:07 +00:00
John Baldwin	dfa7bc486b	- Add a function to set the Task Priority Register (TPR) of the local APIC. Currently this is only used to initiailize the TPR to 0 during initial setup. - Reallocate vectors for the local APIC timer, error, and thermal LVT entries. The timer entry is allocated from the top of the I/O interrupt range reducing the number of vectors available for hardware interrupts to 191. Linux happens to use the same exact vector for its timer interrupt as well. If the timer vector shared the same priority queue as the IPI handlers, then the frequency that the timer vector will eventually be firing at can interact badly with the IPIs resulting in the queue filling and the dreaded IPI stuck panics, hence it being located at the top of the previous priority queue instead. - Fixup various minor nits in comments.	2004-12-23 19:47:59 +00:00
Stephan Uphoff	f30a4a1ced	Avoid more than two pending IPI interrupt vectors per local APIC as this may cause deadlocks. This should fix kern/72123. Discussed with: jhb Tested by: Nik Azim Azam, Andy Farkas, Flack Man, Aykut KARA Izzet BESKARDES, Jens Binnewies, Karl Keusgen Approved by: sam (mentor)	2004-12-07 20:15:01 +00:00
John Baldwin	96d3b93753	Various cleanups in support of a future ioapic_config_intr() function: - Allow ioapic_set_{nmi,smi,extint}() to be called multiple times on the same pin so long as the pin's mode is the same as the mode being requested. - Add a notion of bus type for the interrupt associated with interrupt pin. This is needed so that we can force all EISA interrupts to be active high in the forthcoming ioapic_config_intr(). - Fix a bug for EISA systems that didn't remap IRQs. This would have broken EISA systems that tried to disable mixed mode for IRQ 0.	2004-06-23 15:29:20 +00:00
John Baldwin	eb8943b13e	Rework the APIC mixed mode support a bit: - Require the APIC enumerators to explicitly enable mixed mode by calling ioapic_enable_mixed_mode(). Calling this function tells the apic driver that the PC-AT 8259A PICs are present and routable through the first I/O APIC via an ExtINT pin. The mptable enumerator always calls this function for now. The MADT enumerator only enables mixed mode if the PC-AT compatability flag is set in the MADT header. - Allow mixed mode to be enabled or disabled via a 'hw.apic.mixed_mode' tunable. By default this tunable is set to 1 (true). The kernel option NO_MIXED_MODE changes the default to 0 to preserve existing behavior, but adding 'hw.apic.mixed_mode=0' to loader.conf achieves the same effect. - Only use mixed mode to route IRQ 0 if it is both enabled by the APIC enumerator and activated by the loader tunable. Note that both conditions must be true, so if the APIC enumerator does not enable mixed mode, then you can't set the tunable to try to override the enumerator.	2004-05-10 18:49:58 +00:00
John Baldwin	c2ce35977e	- Change the APIC code to mostly use the recently added intr_trigger and intr_polarity enums for passing around interrupt trigger modes and polarity rather than using the magic numbers 0 for level/low and 1 for edge/high. - Convert the mptable parsing code to use the new ELCR wrapper code rather than reading the ELCR directly. Also, use the ELCR settings to control both the trigger and polarity of EISA IRQs instead of just the trigger mode. - Rework the MADT's handling of the ACPI SCI again: - If no override entry for the SCI exists at all, use level/low trigger instead of the default edge/high used for ISA IRQs. - For the ACPI SCI, use level/low values for conforming trigger and polarity rather than the edge/high values we use for all other ISA IRQs. - Rework the tunables available to override the MADT. The hw.acpi.force_sci_lo tunable is no longer supported. Instead, there are now two tunables that can independently override the trigger mode and/or polarity of the SCI. The hw.acpi.sci.trigger tunable can be set to either "edge" or "level", and the hw.acpi.sci.polarity tunable can be set to either "high" or "low". To simulate hw.acpi.force_sci_lo, set hw.acpi.sci.trigger to "level" and hw.acpi.sci.polarity to "low". If you are having problems with ACPI either causing an interrupt storm or not working at all (e.g., the power button doesn't turn invoke a shutdown -p now), you can try tweaking these two tunables to find the combination that works.	2004-05-04 20:39:24 +00:00
Peter Wemm	40e3826a9f	Whitespace nit (sorry, couldn't help it)	2003-11-14 22:21:30 +00:00
John Baldwin	3ab2ba59f4	Shuffle the APIC interrupt vectors around a bit: - Move the IPI and local APIC interrupt vectors up into the 0xf0 - 0xff range. The pmap lazyfix IPI was reordered down next to the TLB shootdowns to avoid conflicting with the spurious interrupt vector. - Move the base of APIC interrupts up 16 so that the first 16 APIC interrupts do not overlap the vectors used by the ATPIC. - Remove bogus interrupt vector reservations for LINT[01]. - Now that 0xc0 - 0xef are available, use them for device interrupts. This increases the number of APIC device interrupts to 191. - Increase the system-wide number of global interrupts to 191 to catch up to more APIC interrupts. Requested by: peter (2)	2003-11-14 19:10:13 +00:00
John Baldwin	69487322d8	Fix a typo.	2003-11-13 16:41:07 +00:00
John Baldwin	bd9cd7e3f7	- Move manipulation of td_intr_nesting_level out of assembly interrupt vector stubs and into the C functions they call. - Move disabling and EOIing of interrupt sources out of PIC driver entry points and into intr_execute_handlers(). Intr_execute_handlers() only disables a source for an interrupt if it is a stray interrupt or has threaded handlers. Sources with fast handlers no longer disable (mask) the source while executing the handlers. - Move the setting of clkintr_pending into intr_execute_handlers() and set the variable for any interrupt source with a vector of 0. (Should only be true for IRQ 0.) This fixes clkintr_pending in the NO_MIXED_MODE case. - Implement lapic_eoi() and use it to implement ioapic_eoi_source(). - Rename atpic_sched_ithd() to atpic_handle_intr() since it is used to handle all atpic interrupts and not just threaded ones. Inspired by: peter's changes to amd64 in p4 (1) Requested by: bde (2)	2003-11-12 18:13:57 +00:00
John Baldwin	6f92bdd0c1	New APIC support code: - The apic interrupt entry points have been rewritten so that each entry point can serve 32 different vectors. When the entry is executed, it uses one of the 32-bit ISR registers to determine which vector in its assigned range was triggered. Thus, the apic code can support 159 different interrupt vectors with only 5 entry points. - We now always to disable the local APIC to work around an errata in certain PPros and then re-enable it again if we decide to use the APICs to route interrupts. - We no longer map IO APICs or local APICs using special page table entries. Instead, we just use pmap_mapdev(). We also no longer export the virtual address of the local APIC as a global symbol to the rest of the system, but only in local_apic.c. To aid this, the APIC ID of each CPU is exported as a per-CPU variable. - Interrupt sources are provided for each intpin on each IO APIC. Currently, each source is given a unique interrupt vector meaning that PCI interrupts are not shared on most machines with an I/O APIC. That mapping for interrupt sources to interrupt vectors is up to the APIC enumerator driver however. - We no longer probe to see if we need to use mixed mode to route IRQ 0, instead we always use mixed mode to route IRQ 0 for now. This can be disabled via the 'NO_MIXED_MODE' kernel option. - The npx(4) driver now always probes to see if a built-in FPU is present since this test can now be performed with the new APIC code. However, an SMP kernel will panic if there is more than one CPU and a built-in FPU is not found. - PCI interrupts are now properly routed when using APICs to route interrupts, so remove the hack to psuedo-route interrupts when the intpin register was read. - The apic.h header was moved to apicreg.h and a new apicvar.h header that declares the APIs used by the new APIC code was added.	2003-11-03 21:53:38 +00:00

18 Commits