freebsd-skq

Author	SHA1	Message	Date
jhb	b75d5a0ef9	Respect the BIOS/firmware's notion of acceptable address ranges for PCI resource allocation on x86 platforms: - Add a new helper API that Host-PCI bridge drivers can use to restrict resource allocation requests to a set of address ranges for different resource types. - For the ACPI Host-PCI bridge driver, use Producer address range resources in _CRS to enumerate valid address ranges for a given Host-PCI bridge. This can be disabled by including "hostres" in the debug.acpi.disabled tunable. - For the MPTable Host-PCI bridge driver, use entries in the extended MPTable to determine the valid address ranges for a given Host-PCI bridge. This required adding code to parse extended table entries. Similar to the new PCI-PCI bridge driver, these changes are only enabled if the NEW_PCIB kernel option is enabled (which is enabled by default on amd64 and i386). Approved by: re (kib)	2011-07-15 21:08:58 +00:00
jkim	96d6cc9832	If TSC stops ticking in C3, disable deep sleep when the user forcefully select TSC as timecounter hardware. Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de)	2011-07-14 21:00:26 +00:00
jhb	83fca1d193	Move {amd64,i386}/pci/pci_bus.c and {amd64,i386}/include/pci_cfgreg.h to the x86 tree. The $PIR code is still only enabled on i386 and not amd64. While here, make the qpi(4) driver on conditional on 'device pci'.	2011-06-22 21:04:13 +00:00
jkim	6da60ac39e	Set negative quality to TSC timecounter when C3 state is enabled for Intel processors unless the invariant TSC bit of CPUID is set. Intel processors may stop incrementing TSC when DPSLP# pin is asserted, according to Intel processor manuals, i. e., TSC timecounter is useless if the processor can enter deep sleep state (C3/C4). This problem was accidentally uncovered by r222869, which increased timecounter quality of P-state invariant TSC, e.g., for Core2 Duo T5870 (Family 6, Model f) and Atom N270 (Family 6, Model 1c). Reported by: Fabian Keil (freebsd-listen at fabiankeil dot de) Ian FREISLICH (ianf at clue dot co dot za) Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de) - Core2 Duo T5870 (C3 state available/enabled) jkim - Xeon X5150 (C3 state unavailable)	2011-06-22 16:40:45 +00:00
jkim	8a9fdbb838	Teach the compiler how to shift TSC value efficiently. As noted in r220631, some times compiler inserts redundant instructions to preserve unused upper 32 bits even when it is casted to a 32-bit value. Unfortunately, it seems the problem becomes more serious when it is shifted, especially on amd64.	2011-06-17 21:41:06 +00:00
jkim	9c58536b52	Tidy up r222866. - Re-add accidentally removed atomic op. for sysctl(9) handler. - Remove a period(`.') at the end of a debugging message. - Consistently spell "low" for "TSC-low" timecounter throughout. Pointed out by: bde	2011-06-08 23:44:59 +00:00
jkim	9f1a70eb73	Increase quality of TSC (or TSC-low) timecounter to 1000 if it is P-state invariant. For SMP case (TSC-low), it also has to pass SMP synchronization test and the CPU vendor/model has to be white-listed explicitly. Currently, all Intel CPUs and single-socket AMD Family 15h processors are listed here. Discussed with: hackers	2011-06-08 20:08:06 +00:00
jkim	faed140c1e	Introduce low-resolution TSC timecounter "TSC-low". It replaces the normal TSC timecounter if TSC frequency is higher than ~4.29 MHz (or 2^32-1 Hz) or multiple CPUs are present. The "TSC-low" frequency is always lower than a preset maximum value and derived from TSC frequency (by being halved until it becomes lower than the maximum). Note the maximum value for SMP case is significantly lower than UP case because we want to reduce (rare but known) "temporal anomalies" caused by non-serialized RDTSC instruction. Normally, it is still higher than "ACPI-fast" timecounter frequency (which was default timecounter hardware for long time until r222222) to be useful.	2011-06-08 19:38:31 +00:00
jkim	16bd333059	Remove a redundant assignment since r221703.	2011-06-08 18:52:42 +00:00
attilio	d7cb9e4814	MFC	2011-05-09 18:53:13 +00:00
attilio	a0b51ba62f	MFC	2011-05-06 22:45:33 +00:00
attilio	fe4de567b5	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architecture. cpumask_t on the other side is implemented as an array of longs, and easilly extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, needs to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primirally done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to came soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
attilio	b29cc3952a	MFC	2011-05-03 18:57:46 +00:00
jkim	44b010200b	Fix build with clang. Please note there is an LLVM/Clang PR: http://llvm.org/bugs/show_bug.cgi?id=9379 Reported by: rpaulo, dim	2011-05-02 17:08:36 +00:00
jhb	3e97a80649	Add implementations of BUS_ADJUST_RESOURCE() to the PCI bus driver, generic PCI-PCI bridge driver, x86 nexus driver, and x86 Host to PCI bridge drivers.	2011-05-02 14:13:12 +00:00
jhb	08955ceac0	Change rman_manage_region() to actually honor the rm_start and rm_end constraints on the rman and reject attempts to manage a region that is out of range. - Fix various places that set rm_end incorrectly (to ~0 or ~0u instead of ~0ul). - To preserve existing behavior, change rman_init() to set rm_start and rm_end to allow managing the full range (0 to ~0ul) if they are not set by the caller when rman_init() is called.	2011-04-29 18:41:21 +00:00
jkim	3b56923bc2	Detect VMware guest and set the TSC frequency as reported by the hypervisor. VMware products virtualize TSC and it run at fixed frequency in so-called "apparent time". Although virtualized i8254 also runs in apparent time, TSC calibration always gives slightly off frequency because of the complicated timer emulation and lost-tick correction mechanism.	2011-04-29 18:20:12 +00:00
jkim	6bf8d645b9	Turn off periodic recalibration of CPU ticker frequency if it is invariant.	2011-04-28 17:56:02 +00:00
attilio	d685681d59	Add the watchdogs patting during the (shutdown time) disk syncing and disk dumping. With the option SW_WATCHDOG on, these operations are doomed to let watchdog fire, fi they take too long. I implemented the stubs this way because I really want wdog_kern_* KPI to not be dependant by SW_WATCHDOG being on (and really, the option only enables watchdog activation in hardclock) and also avoid to call them when not necessary (avoiding not-volountary watchdog activations). Sponsored by: Sandvine Incorporated Discussed with: emaste, des MFC after: 2 weeks	2011-04-28 16:02:05 +00:00
jkim	8e677ed825	Use ACPI-supplied CPU frequencies instead of estimated ones as we are about to use other values from the same table anyway. MFC after: 3 days	2011-04-27 00:32:35 +00:00
jkim	adbae7d2c9	Use newly added rdtsc32() for DELAY(9) as well.	2011-04-14 19:11:45 +00:00
jkim	38dc7c42e6	Work around an emulator problem where virtual CPU advertises TSC is P-state invariant and APERF/MPERF MSRs exist but these MSRs never tick. When we calculate effective frequency from cpu_est_clockrate(), it caused panic of division-by-zero. Now we test whether these MSRs actually increase to avoid such foot-shooting. Reported by: dim Tested by: dim	2011-04-14 17:50:26 +00:00
jkim	c55a9d790a	Use newly added rdtsc32() for the timecounter_get_t method.	2011-04-14 17:08:23 +00:00
jkim	5b73ac45d1	Add some tunable descriptions about x86 timers. Requested by: arundel	2011-04-14 00:07:08 +00:00
jkim	2092a06579	Do not use TSC for DELAY(9) if it not P-state invariant to avoid possible foot-shooting. DELAY() becomes unreliable when TSC frequency varies wildly, especially cpufreq(4) and powerd(8) are used at the same time.	2011-04-12 22:41:52 +00:00
jkim	8eb15cd79a	Probe capability to find effective frequency. When the TSC is P-state invariant, APERF/MPERF ratio can be used to find effective frequency.	2011-04-12 22:15:46 +00:00
jkim	76647eca4d	Add a new tunable 'machdep.disable_tsc_calibration' to allow skipping TSC frequency calibration. For Intel processors, if brand string from CPUID contains its nominal frequency, this frequency is used instead.	2011-04-12 21:08:34 +00:00
jkim	61582b7c03	Merge two similar functions to reduce duplication.	2011-04-11 19:27:44 +00:00
jkim	096c7a804f	Refactor DELAYDEBUG as it is only useful for correcting i8254 frequency.	2011-04-08 19:54:29 +00:00
jkim	95c723445e	Use atomic load & store for TSC frequency. It may be overkill for amd64 but safer for i386 because it can be easily over 4 GHz now. More worse, it can be easily changed by user with 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add "virtual TSC" with lower frequency if maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).	2011-04-07 23:28:28 +00:00
jkim	9c28b443cb	Revert r219676. Requested by: jhb, bde	2011-03-16 16:44:08 +00:00
jkim	2c6f3a8cd1	Do not let machdep.tsc_freq modify tsc_freq itself. It is bad for i386 as it does not operate atomically. Actually, it serves no purpose. Noticed by: bde	2011-03-15 19:47:20 +00:00
jkim	ad8ef5e4c7	Deprecate tsc_present as the last of its real consumers finally disappeared.	2011-03-15 17:19:52 +00:00
jkim	36e15e1609	When TSC is unavailable, broken or disabled and the current timecounter has better quality than i8254 timer, use it for DELAY(9).	2011-03-14 22:05:59 +00:00
jkim	7df55dcdeb	Add a tunable "machdep.disable_tsc" to turn off TSC. Specifically, it turns off boot-time CPU frequency calibration, DELAY(9) with TSC, and using TSC as a CPU ticker. Note tsc_present does not change by this tunable.	2011-03-11 00:44:32 +00:00
jkim	a52b39f6a4	Turn off pointless P-state invariant TSC detection based on CPU model on a virtual machine.	2011-03-10 23:06:13 +00:00
jkim	98d68ca741	Deprecate rarely used tsc_is_broken. Instead, we zero out tsc_freq because it is almost always used with tsc_freq any way.	2011-03-10 20:02:58 +00:00
jkim	a0eded8271	Set C1 "I/O then Halt" capability bit for Intel EIST. Some broken BIOSes refuse to load external SSDTs if this bit is unset for _PDC. It seems Linux and OpenSolaris did the same long ago. MFC after: 1 week	2011-02-25 23:14:24 +00:00
brucec	4a353c54fd	Fix typos - remove duplicate "is". PR: docs/154934 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-23 09:22:33 +00:00
jhb	5cec5b65a5	Use a dedicated taskqueue with a thread that runs at a software-interrupt priority for the periodic polling of the machine check registers.	2011-02-03 13:09:22 +00:00
mdf	6b5f615b7c	Introduce signed and unsigned version of CTLTYPE_QUAD, renaming existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().	2011-01-19 23:00:25 +00:00
jhb	cfd16f7125	If an interrupt on an I/O APIC is moved to a different CPU after it has started to execute, it seems that the corresponding ISR bit in the "old" local APIC can be cleared. This causes the local APIC interrupt routine to fail to find an interrupt to service. Rather than panic'ing in this case, simply return from the interrupt without sending an EOI to the local APIC. If there are any other pending interrupts in other ISR registers, the local APIC will assert a new interrupt. Tested by: steve	2011-01-13 17:00:22 +00:00
mdf	0afa6047de	Revert to using bus_size_t for the bounce_zone's alignment member. Reuqested by: jhb	2011-01-13 00:52:57 +00:00
mdf	30a663c808	Fix a brain fart. Since this file is shared between i386 and amd64, a bus_size_t may be 32 or 64 bits. Change the bounce_zone alignment field to explicitly be 32 bits, as I can't really imagine a DMA device that needs anything close to 2GB alignment of data.	2011-01-12 21:08:49 +00:00
mdf	f6a71a40b2	sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. Commit the kernel changes.	2011-01-12 19:54:19 +00:00
jhb	c17f46e472	Remove unneeded includes of <sys/linker_set.h>. Other headers that use it internally contain nested includes. Reviewed by: bde	2011-01-11 13:59:06 +00:00
tijl	75b3c29fb3	Copy powerpc/include/_inttypes.h to x86 and replace i386/amd64/pc98 headers with stubs. Approved by: kib (mentor)	2011-01-08 18:09:48 +00:00
jhb	bdcd5b684b	Drop the icu_lock spinlock while pausing briefly after masking the interrupt in the I/O APIC before moving it to a different CPU. If the interrupt had been triggered by the I/O APIC after locking icu_lock but before we masked the pin in the I/O APIC, then this could cause the interrupt to be pending on the "old" CPU and it would finally trigger after we had moved the interrupt to the new CPU. This could cause us to panic as there was no interrupt source associated with the old IDT vector on the old CPU. Dropping the lock after the interrupt is masked but before it is moved allows the interrupt to fire and be handled in this case before it is moved. Tested by: Daniel Braniss danny of cs huji ac il MFC after: 1 week	2010-12-23 15:17:28 +00:00
tijl	0f810ef0a2	Merge amd64 and i386 bus.h and move the resulting header to x86. Replace the original amd64 and i386 headers with stubs. Rename (AMD64\|I386)_BUS_SPACE_* to X86_BUS_SPACE_* everywhere. Reviewed by: imp (previous version), jhb Approved by: kib (mentor)	2010-12-20 16:39:43 +00:00
jhb	f6949632bc	Small style fixes: - Avoid side-effect assignments in if statements when possible. - Don't use ! to check for NULL pointers, explicitly check against NULL. - Explicitly check error return values against 0. - Don't use INTR_MPSAFE for interrupt handlers with only filters as it is meaningless. - Remove unneeded function casts.	2010-12-16 17:05:28 +00:00
jkim	3d43bf49cc	Remove AMD Family 0Fh, Model 6Bh, Stepping 2 from the list of P-state invariant CPUs. I do not believe this model is P-state invariant any more. Maybe cpufreq(4) was broken at the time of commit. :-(	2010-12-09 21:29:36 +00:00
cperciva	e7d2a75ec6	Replace i386/i386/busdma_machdep.c and amd64/amd64/busdma_machdep.c (which are identical) with a single x86/x86/busdma_machdep.c.	2010-12-09 06:41:50 +00:00
jkim	340a707cd6	Merge sys/amd64/amd64/tsc.c and sys/i386/i386/tsc.c and move to sys/x86/x86. Discussed with: avg	2010-12-08 00:09:24 +00:00
tijl	568d308167	Merge amd64/i386 _align.h by aligning on the size of register_t (copied from powerpc). Reviewed by: imp, jhb Approved by: kib (mentor)	2010-11-26 10:59:20 +00:00
avg	a09bca7c84	x86/local_apic: use newly added ARAT bit definition ARAT: APIC-Timer-always-running feature. Suggested by: mav MFC after: 12 days	2010-11-23 14:36:14 +00:00
avg	e9d45ef8b4	hwpstate: use CPU_FOREACH when binding to all available processors Also, add a comment mentioning _PSD - on some systems it's enough to put one logical CPU into a particular P-state to make other CPUs in the same domain to enter that P-state. Also, call sched_unbind() after the loop - sched_bind() automatically rebinds from previous CPU to a new one, and the new arrangement of code is safer against early loop exit. Plus one minor style nit. MFC after: 10 days	2010-11-16 12:43:45 +00:00
jkim	56b80da7ca	Move identical copies of apm_bios.h to sys/x86/include, replace them with stubs, and adjust PC98 stub accordingly. Reviewed by: imp, nyan	2010-11-11 19:36:21 +00:00
avg	286aff4b8c	make it possible to actually enable hwpstate_verbose Either via the tunable or the sysctl. MFC after: 3 days	2010-11-11 17:30:49 +00:00
jkim	d6fa755921	Make APM emulation look more closer to its origin. Use device_get_softc(9) instead of hardcoding acpi(4) unit number as we have device_t for it.	2010-11-10 18:50:12 +00:00
jkim	f01ec2a1c6	Refactor acpi_machdep.c for amd64 and i386, move APM emulation into a new file acpi_apm.c, and place it on sys/x86/acpica.	2010-11-10 01:29:56 +00:00
attilio	4963bf694d	Move the mptable.h under x86/include/. Sponsored by: Sandvine Incorporated MFC after: 14 days	2010-11-09 20:28:09 +00:00
jkim	83f0c9c278	Now OsdEnvironment.c is identical on amd64 and i386. Move it to a new home.	2010-11-09 00:27:18 +00:00
jhb	81db049683	Move the MADT parser for amd64 and i386 to sys/x86/acpica now that it is identical on both platforms.	2010-11-08 20:57:02 +00:00
jhb	bfc0fcbf5e	Sync the APIC startup sequence with amd64: - Register APIC enumerators at SI_SUB_TUNABLES - 1 instead of SI_SUB_CPU - 1. - Probe CPUs at SI_SUB_TUNABLES - 1. This allows i386 to set a truly accurate mp_maxid value rather than always setting it to MAXCPU - 1.	2010-11-08 20:35:09 +00:00
jhb	9c0fca4e23	Only dump the values of the PMC and CMCI local vector table entries on a local APIC if those LVT entries are valid. This quiets spurious illegal register local APIC errors during boot on a CPU that doesn't support those vectors. MFC after: 1 week	2010-11-08 20:03:51 +00:00
jhb	db8115405e	Cosmetic change to revert one of my earlier ones. #if __i386__ && PAE is identical to just #if PAE since PAE is only a valid option for i386. Submitted by: attilio	2010-11-02 20:16:41 +00:00
jhb	5de2d05872	Further tweaks to the ram_attach() routine: - Use > 2^32 - 1 instead of >= when checking for memory regions above 4G. - Skip SMAP entries > 4G on i386 rather than breaking out of the loop since SMAP entries are not guaranteed to be in order. - Remove 'i' and loop over 'rid' directly in the dump_avail[] case. - Only check for 4G regions in the dump_avail[] case on i386 if PAE is enabled since vm_paddr_t is 32-bit in the !PAE case. Submitted by: alc	2010-11-02 17:56:16 +00:00
jhb	3108c93ec3	Skip SMAP regions above 4GB on i386 since they will not fit into a long. While here, update some comments to better explain the new code flow. Tested by: dhw	2010-11-02 13:04:25 +00:00
jhb	e0a2a85d3a	Move <machine/apicreg.h> to <x86/apicreg.h>.	2010-11-01 18:18:46 +00:00
jhb	c7dd85142c	Move the <machine/mca.h> header to <x86/mca.h>.	2010-11-01 17:40:35 +00:00
attilio	62fe941c60	- Merge ram_attach() implementation for i386 and amd64 - Rename RES_BUS_SPACE_* into BUS_SPACE_* for consistency - Trim out an unnecessary checking condition Sponsored by: Sandvine Incorporated Requested and reviewed by: jhb	2010-10-29 18:33:43 +00:00
attilio	7ab661360c	Merge nexus.c from amd64 and i386 to x86 subtree. Sponsored by: Sandvine Incorporated Tested by: gianni	2010-10-28 16:31:39 +00:00
attilio	0237c602c6	Merge the mptable support from MD bits to x86 subtree. Sponsored by: Sandvine Incorporated Discussed with: jhb	2010-10-28 07:58:06 +00:00
attilio	f7f474e011	Style fix. Reported by: bde, dim	2010-10-26 18:01:28 +00:00
attilio	47fde01b6c	Remove usage of PRI* macro for style compliancy. Requested by: bde, jhb Sponsored by: Sandvine Incorporated	2010-10-26 16:16:15 +00:00
attilio	efd2e37632	Merge dump_machdep.c i386/amd64 under the x86 subtree. Sponsored by: Sandvine Incorporated Tested by: gianni	2010-10-26 12:46:26 +00:00
jhb	9bbda5d7a4	Use 'saveintr' instead of 'savecrit' or 'eflags' to hold the state returned by intr_disable(). Requested by: bde	2010-10-25 15:31:13 +00:00
avg	21647a4834	atrtc: remove (pre-)historic check of RTC NVRAM at address 0x0e Old scrolls tell that once upon a time IBM AT BIOS was known to put some useful system diagnostic information into RTC NVRAM. It is not really known if and for how long PC BIOSes followed that convention, but I believe that many, if not all, modern BIOSes do not do that any more (not mentioning other types of x86 firmware). Some diagnostic bits don't even make any sense any longer. The check results in confusing messages upon boot on some systems. So I am removing it. Discussed with: bde, jhb, mav MFC after: 3 weeks	2010-10-16 10:45:36 +00:00
mav	bbf7bbb468	Restore pre-r212778 optimization, skipping timer reprogramming when it is not neccessary. It allows to avoid time counter jump of up to 1/18s, when base frequency slightly tuned via machdep.i8254_freq sysctl. Fix few style things. Suggested by: bde	2010-09-18 07:36:43 +00:00
mav	0f4b390682	Add one-shot mode support to attimer (i8254) event timer. Unluckily, using one-shot mode is impossible, when same hardware used for time counting. Introduce new tunable hint.attimer.0.timecounter, setting which to 0 disables i8254 time counter and allows one-shot mode. Note, that on some systems there may be no other reliable enough time counters, so this tunable should be used with understanding.	2010-09-17 04:48:50 +00:00
mav	e0d6e8068d	Few whitespace cleanups and comments tunings. Submitted by: arundel	2010-09-16 02:59:25 +00:00
mav	eb4931dc6c	Refactor timer management code with priority to one-shot operation mode. The main goal of this is to generate timer interrupts only when there is some work to do. When CPU is busy interrupts are generating at full rate of hz + stathz to fullfill scheduler and timekeeping requirements. But when CPU is idle, only minimum set of interrupts (down to 8 interrupts per second per CPU now), needed to handle scheduled callouts is executed. This allows significantly increase idle CPU sleep time, increasing effect of static power-saving technologies. Also it should reduce host CPU load on virtualized systems, when guest system is idle. There is set of tunables, also available as writable sysctls, allowing to control wanted event timer subsystem behavior: kern.eventtimer.timer - allows to choose event timer hardware to use. On x86 there is up to 4 different kinds of timers. Depending on whether chosen timer is per-CPU, behavior of other options slightly differs. kern.eventtimer.periodic - allows to choose periodic and one-shot operation mode. In periodic mode, current timer hardware taken as the only source of time for time events. This mode is quite alike to previous kernel behavior. One-shot mode instead uses currently selected time counter hardware to schedule all needed events one by one and program timer to generate interrupt exactly in specified time. Default value depends of chosen timer capabilities, but one-shot mode is preferred, until other is forced by user or hardware. kern.eventtimer.singlemul - in periodic mode specifies how much times higher timer frequency should be, to not strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted. kern.eventtimer.idletick - makes each CPU to receive every timer interrupt independently of whether they busy or not. By default this options is disabled. If chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generating. As soon as this patch modifies cpu_idle() on some platforms, I have also refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions (if supported) under high sleep/wakeup rate, as fast alternative to other methods. It allows SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads. Tested by: many (on i386, amd64, sparc64 and powerc) H/W donated by: Gheorghe Ardelean Sponsored by: iXsystems, Inc.	2010-09-13 07:25:35 +00:00
jhb	b33adc1bec	Each processor socket in a QPI system has a special PCI bus for the "uncore" devices (such as the memory controller) in that socket. Stop hardcoding support for two busses, but instead start probing buses at domain 0, bus 255 and walk down until a bus probe fails. Also, do not probe a bus if it has already been enumerated elsewhere (e.g. if ACPI ever enumerates these buses in the future).	2010-09-07 13:50:02 +00:00
rpaulo	3cf9c58268	When DTrace is enabled, make sure we don't overwrite the IDT_DTRACE_RET entry with an IRQ for some hardware component. Reviewed by: jhb Sponsored by: The FreeBSD Foundation	2010-08-30 18:12:21 +00:00
jhb	07e59b2937	Correctly ensure that the CPU family is 0x6, not non-zero. Submitted by: Dimitry Andric	2010-08-25 20:37:58 +00:00
jhb	4db720c0be	Intel QPI chipsets actually provide two extra "non-core" PCI buses that provide PCI devices for various hardware such as memory controllers, etc. These PCI buses are not enumerated via ACPI however. Add qpi(4) psuedo bus and Host-PCI bridge drivers to enumerate these buses. Currently the driver uses the CPU ID to determine the bridges' presence. In collaboration with: Joseph Golio @ Isilon Systems MFC after: 2 weeks	2010-08-25 19:12:05 +00:00
mav	7d99c51342	Enable timer interrupt before starting timer. This allows to handle very short periods without interrupt loss.	2010-08-24 16:08:01 +00:00
jhb	01db506530	When performing a sanity check on the SRAT table to ensure that each memory domain has an assigned CPU, ignore disabled CPUs. Previously disabled CPUs were counted as being in domain 0. Reported by: mdf	2010-07-29 17:37:35 +00:00
jhb	1e88e37ddc	The corrected error count field is dependent on CMCI, not TES. MFC after: 1 week	2010-07-28 21:52:09 +00:00
jhb	9595a597c5	Add a parser for the ACPI SRAT table for amd64 and i386. It sets PCPU(domain) for each CPU and populates a mem_affinity array suitable for the NUMA support in the physical memory allocator. Reviewed by: alc	2010-07-27 20:40:46 +00:00
mav	cabdb41848	Increment td->td_intr_nesting_level for LAPIC timer interrupts. Among other things it hints SCHED_ULE to run clock swi handlers on their native CPUs, avoiding many unneeded IPI_PREEMPT calls.	2010-07-24 10:49:59 +00:00
mav	0ea74c96a2	Fix several un-/signedness bugs of r210290 and r210293. Add one more check.	2010-07-20 15:48:29 +00:00
mav	1021ed9c1f	Extend timer driver API to report also minimal and maximal supported period lengths. Make MI wrapper code to validate periods in request. Make kernel clock management code to honor these hardware limitations while choosing hz, stathz and profhz values.	2010-07-20 10:58:56 +00:00
mav	b8b00841c9	Move timeevents.c to MI code, as it is not x86-specific. I already have it working on Marvell ARM SoCs, and it would be nice to unify timer code between more platforms.	2010-07-14 13:31:27 +00:00
mav	3eb85b7ef3	Remove some unneeded includes. Code now can be built on ARM.	2010-07-14 10:49:14 +00:00
mav	b076092fdd	Rise knowledge about curthread->td_intr_frame by one step. Make timer callback argument really opaque. Not repeat interrupt handler's problem in case somebody will ever need to have both argument and frame.	2010-07-13 12:46:06 +00:00
mav	f723107552	Unify pc98 event timer code with the rest of x86. Reviewed by: nyan@	2010-07-13 06:57:27 +00:00
mav	55884c8757	Instead of deleting existing IRQ resource, which is not really working for ACPI bus, find wanted IRQ rid or spare one. This should fix panic during boot on systems reporting fancy IRQ numbers for attimer and atrtc.	2010-07-12 06:46:17 +00:00
mav	5036d2a373	Make kernel panic with reasonable message if no usable event timer found.	2010-07-11 17:08:37 +00:00
mav	234db8607d	Allow attimer to be hinted at ISA if not reported by ISA PNP or ACPI. Rephrase respective atrtc code same way to be more readable.	2010-07-01 18:59:05 +00:00
mav	0966180de4	Rework r209456: Instead of using fake rid (which ISA doesn't like), delete untrusted IRQ resource and let it be recreated.	2010-07-01 18:51:18 +00:00
mav	70e4b2fc33	Do not trust IRQ reported by ACPI. There are cases when it is wrong.	2010-06-23 05:43:21 +00:00
mav	53ba2d6cf3	Add "legacy route" support to HPET driver. When enabled, this mode makes HPET to steal IRQ0 from i8254 and IRQ8 from RTC timers. It can be suitable for HPETs without FSB interrupts support, as it gives them two unshared IRQs. It allows them to provide one per-CPU event timer on dual-CPU system, that should be suitable for further tickless kernels. To enable it, such lines may be added to /boot/loader.conf: hint.atrtc.0.clock=0 hint.attimer.0.clock=0 hint.hpet.0.legacy_route=1	2010-06-22 19:42:27 +00:00
mav	c7c1ff1999	Fix i386 LINT build broken by r209371. There appeared such legacy thing as APM, that somehow breaking RTC.	2010-06-21 19:53:47 +00:00
mav	d1175426d7	Implement new event timers infrastructure. It provides unified APIs for writing event timer drivers, for choosing best possible drivers by machine independent code and for operating them to supply kernel with hardclock(), statclock() and profclock() events in unified fashion on various hardware. Infrastructure provides support for both per-CPU (independent for every CPU core) and global timers in periodic and one-shot modes. MI management code at this moment uses only periodic mode, but one-shot mode use planned for later, as part of tickless kernel project. For this moment infrastructure used on i386 and amd64 architectures. Other archs are welcome to follow, while their current operation should not be affected. This patch updates existing drivers (i8254, RTC and LAPIC) for the new order, and adds event timers support into the HPET driver. These drivers have different capabilities: LAPIC - per-CPU timer, supports periodic and one-shot operation, may freeze in C3 state, calibrated on first use, so may be not exactly precise. HPET - depending on hardware can work as per-CPU or global, supports periodic and one-shot operation, usually provides several event timers. i8254 - global, limited to periodic mode, because same hardware used also as time counter. RTC - global, supports only periodic mode, set of frequencies in Hz limited by powers of 2. Depending on hardware capabilities, drivers preferred in following orders, either LAPIC, HPETs, i8254, RTC or HPETs, LAPIC, i8254, RTC. User may explicitly specify wanted timers via loader tunables or sysctls: kern.eventtimer.timer1 and kern.eventtimer.timer2. If requested driver is unavailable or unoperational, system will try to replace it. If no more timers available or "NONE" specified for second, system will operate using only one timer, multiplying it's frequency by few times and uing respective dividers to honor hz, stathz and profhz values, set during initial setup.	2010-06-20 21:33:29 +00:00
mav	3e17abd3a9	Core i5, same as previously Core2Duo, found to not set P-state for single core lower then set on other cores. Do not try to test P-states on attach on SMP systems. It is hopeless now and will just pollute verbose logs. If needed, check still can be forced via loader tunable.	2010-06-19 13:09:42 +00:00
jhb	3e2692fd42	Restore the machine check register banks on resume. For banks being monitored via CMCI, reset the interrupt threshold to 1 on resume. Reviewed by: jkim MFC after: 2 weeks	2010-06-15 18:51:41 +00:00
mav	ea954fa396	Virtualize pci_remap_msi_irq() call from general MSI code. It allows MSI (FSB interrupts) to be used by non-PCI devices, such as HPET.	2010-06-14 07:10:37 +00:00
jhb	9b74a62d73	Update several places that iterate over CPUs to use CPU_FOREACH().	2010-06-11 18:46:34 +00:00
mav	6aaeccb81b	Do not disable edge-triggered interrupts before migration. DELAY() with interrupt disabled highly probable causes interrupt loss.	2010-06-10 17:04:01 +00:00
jhb	922c53a83d	Move the MD support for PCI message signalled interrupts to the x86 tree as it is identical for i386 and amd64.	2010-06-08 18:36:03 +00:00
jhb	7e0b91d988	Move the machine check support code to the x86 tree since it is identical on i386 and amd64. Requested by: alc	2010-06-08 18:04:07 +00:00
jhb	9a18b596b8	Move the I/O APIC code to the x86 tree since it is identical on i386 and amd64.	2010-06-08 17:51:21 +00:00
jhb	9e6f9b1e86	Add support for corrected machine check interrupts. CMCI is a new local APIC interrupt that fires when a threshold of corrected machine check events is reached. CMCI also includes a count of events when reporting corrected errors in the bank's status register. Note that individual banks may or may not support CMCI. If they do, each bank includes its own threshold register that determines when the interrupt fires. Currently the code uses a very simple strategy where it doubles the threshold on each interrupt until it succeeds in throttling the interrupt to occur only once a minute (this interval can be tuned via sysctl). The threshold is also adjusted on each hourly poll which will lower the threshold once events stop occurring. Tested by: Sailaja Bangaru sbappana at yahoo com MFC after: 1 month	2010-05-24 15:45:05 +00:00
mav	48198e3ddd	- Implement MI helper functions, dividing one or two timer interrupts with arbitrary frequencies into hardclock(), statclock() and profclock() calls. Same code with minor variations duplicated several times over the tree for different timer drivers and architectures. - Switch all x86 archs to new functions, simplifying the code and removing extra logic from timer drivers. Other archs are also welcome.	2010-05-24 11:40:49 +00:00
mav	045332b709	Restore different APIC init orders for i386 and amd64 unified in r208452. Seems noone of them contents both arch for different reasons. Submitted by: kib@	2010-05-24 01:49:00 +00:00
mav	2b2103d8ae	Unify local_apic.c for x86 archs,	2010-05-23 17:45:01 +00:00
rpaulo	dc4a42cb4b	Fix another instance of lapic_cyclic_clock_func.	2010-04-20 21:04:57 +00:00
attilio	58a84d7424	Default the machdep.lapic_allclocks to be enabled in order to cope with broken atrtc. Now if you want more correct stats on profhz and stathz it may be disabled by setting to 0. Reported by: A. Akephalos <akephalos dot akephalos at gmail dot com>, Jakub Lach <jakub_lach at mailplus dot pl> MFC: 1 week	2010-04-09 14:22:09 +00:00
attilio	5de8477431	Improving the clocks auto-tunning by firstly checking if the atrtc may be correctly initialized and just then assign to softclock/profclock. Right now, some atrtc seems reporting strange diagnostic error* making the current pattern bogus. In order to do that cleanly, lapic_setup_clock(), on both ia32 and amd64, now accepts as arguments the desired sources to handle, and returns the actual ones (LAPIC_CLOCK_NONE is forbidden because otherwise there is no meaning in calling such function). This allows to bring out into commont x86 code the handling part for machdep.lapic_allclocks tunable, which is retained. Sponsored by: Sandvine Incorporated Tested by: yongari, Richard Todd <rmtodd at ichotolot dot servalan dot com> MFC: 3 weeks X-MFC: r202387, 204309	2010-03-03 17:13:29 +00:00
attilio	1b75a98556	Introduce the new kernel sub-tree x86 which should contain all the code shared and generalized between our current amd64, i386 and pc98. This is just an initial step that should lead to a more complete effort. For the moment, a very simple porting of cpufreq modules, BIOS calls and the whole MD specific ISA bus part is added to the sub-tree but ideally a lot of code might be added and more shared support should grow. Sponsored by: Sandvine Incorporated Reviewed by: emaste, kib, jhb, imp Discussed on: arch MFC: 3 weeks	2010-02-25 14:13:39 +00:00

... 3 4 5 6 7

321 Commits