freebsd-nq

Author	SHA1	Message	Date
David E. O'Brien	8bed40c9fe	Consistently use "__LP64__". [there are 33 __LP64__'s in the kernel (minus cddl/ and contrib/), and 11 _LP64's]	2012-05-24 21:44:46 +00:00
John Baldwin	da65bface2	Don't expose i386-only ptrace constants on amd64. This broke gdb with libthread_db on amd64. Reported by: avg	2012-05-17 20:21:55 +00:00
Attilio Rao	b8be27bf29	Revert part of r234723 by re-enabling the SMP protection for intr_bind() on x86. This has been requested by jhb and I strongly disagree with this, but as long as he is the x86 and interrupt subsystem maintainer I will follow his directives. The disagreement comes from what we should really consider as a public KPI. IMHO, if we really need a selection between the kernel functions, we may need an explicit protection like _KERNEL_KPI, which defines which subset of the kernel functions might really be considered as part of the KPI (for third-party modules) and which not. As long as we don't have this mechanism I just consider any possible function as usable by third-party code, thus intr_bind() included. MFC after: 1 week	2012-05-03 21:44:01 +00:00
Attilio Rao	70dbd1604c	Clean up the intr* MD KPI from the SMP dependency, removing a cause of discrepancy between modules and kernel, but deal with SMP differences within the functions themselves. As an added bonus this also helps in terms of code readability. Requested by: gibbs Reviewed by: jhb, marius MFC after: 1 week	2012-04-26 20:24:25 +00:00
Peter Grehan	26b1d645e0	Add x2apic MSR definitions Reviewed by: jhb Obtained from: bhyve via Neel via NetApp	2012-04-17 00:54:38 +00:00
John Baldwin	45b516f642	Trim stray blank line.	2012-04-11 21:00:33 +00:00
John Baldwin	bcd6068179	Recognize the RDRAND instruction feature. Submitted by: Michael Fuckner michael fuckner net MFC after: 3 days	2012-04-09 15:20:16 +00:00
Justin T. Gibbs	47c77b2265	Fix interrupt load balancing regression, introduced in revision 222813, that left all un-pinned interrupts assigned to CPU 0. sys/x86/x86/intr_machdep.c: In intr_shuffle_irqs(), remove CPU_SETOF() call that initialized the "intr_cpus" cpuset to only contain CPU0. This initialization is too late and nullifies the results of calls to intr_add_cpu() that occur much earlier in the boot process. Since "intr_cpus" is statically initialized to the empty set, and all processors, including the BSP, already add themselves to "intr_cpus", no special initialization for the BSP is necessary. MFC after: 3 days	2012-04-06 21:19:28 +00:00
John Baldwin	b867b16dc9	Further tweak the changes made in r233709. The kernel doesn't permit sleeping from a swi handler (even though in this case it would be ok), so switch the refill and scanning SWI handlers to being tasks on a fast taskqueue. Also, only schedule the refill task for a CMCI as an MC# can fire at any time, so it should do the minimal amount of work needed and avoid opportunities to deadlock before it panics (such as scheduling a task it won't ever need in practice). To handle the case of an MC# only finding recoverable errors (which should never happen), always try to refill the event free list when the periodic scan executes. MFC after: 2 weeks	2012-04-02 17:26:21 +00:00
John Baldwin	f2e3bfc074	Make machine check exception logging more readable. On newer Intel systems, an uncorrected ECC error tends to fire on all CPUs in a package simultaneously and the current printf hacks are not sufficient to make the messages legible. Instead, use the existing mca_lock spinlock to serialize calls to mca_log() and change the machine check code to panic directly when an unrecoverable error is encountered rather than falling back to a trap_fatal() call in trap() (which adds nearly a screen-full of logging messages that aren't useful for machine checks). MFC after: 2 weeks	2012-04-02 15:07:22 +00:00
John Baldwin	8b9e9831bf	Attempt to make machine check handling a bit more robust: - Don't malloc() new MCA records for machine checks logged due to a CMCI or MC# exception. Instead, use a pre-allocated pool of records. When a CMCI or MC# exception fires, schedule a swi to refill the pool. The pool is sized to hold at least one record per available machine bank, and one record per CPU. This should handle the case of all CPUs triggering a single bank at once as well as the case of a single CPU triggering all of its banks. The periodic scans still use malloc() since they are run from a safe context. - Since we have to create an swi to handle refills, make the periodic scan a second swi for the same thread instead of having a separate taskqueue thread for the scans. Suggested by: mdf (avoiding malloc()) MFC after: 2 weeks	2012-03-30 20:17:39 +00:00
John Baldwin	435803f3c7	Move the legacy(4) driver to x86.	2012-03-30 19:10:14 +00:00
Dimitry Andric	a80f8859c4	Fix an issue introduced in sys/x86/include/endian.h with r232721. In that revision, the bswapXX_const() macros were renamed to bswapXX_gen(). Also, bswap64_gen() was implemented as two calls to bswap32(), and similarly, bswap32_gen() as two calls to bswap16(). This mainly helps our base gcc to produce more efficient assembly. However, the arguments are not properly masked, which results in the wrong value being calculated in some instances. For example, bswap32(0x12345678) returns 0x7c563412, and bswap64(0x123456789abcdef0) returns 0xfcdefc9a7c563412. Fix this by appropriately masking the arguments to bswap16() in bswap32_gen(), and to bswap32() in bswap64_gen(). This should also silence warnings from clang. Submitted by: jh	2012-03-29 23:31:48 +00:00
Dimitry Andric	4715a95fb4	Revert sys/x86/include/endian.h to what it was before r233419, as that revision has two problems: - It can produce worse code with both clang and gcc. - It doesn't fix the actual issue introduced in r232721, which will be fixed in the next commit. Submitted by: bde, tijl and jh Pointy hat to: dim	2012-03-29 23:30:17 +00:00
John Baldwin	0d95597ca9	Use a more proper fix for enabling HT MSI mapping windows on Host-PCI bridges. Rather than blindly enabling the windows on all of them, only enable the window when an MSI interrupt is enabled for a device behind the bridge, similar to what already happens for HT PCI-PCI bridges. To implement this, each x86 Host-PCI bridge driver has to be able to locate its actual backing device on bus 0. For ACPI, use the _ADR method to find the slot and function of the device. For the non-ACPI case, the legacy(4) driver already scans bus 0 looking for Host-PCI bridge devices. Now it saves the slot and function of each bridge that it finds as ivars that the Host-PCI bridge driver can then use in its pcib_map_msi() method. This fixes machines where non-MSI interrupts were broken by the previous round of HT MSI changes. Tested by: bapt MFC after: 1 week	2012-03-29 19:03:22 +00:00
John Baldwin	46092aeec0	Restore proper use of bounce buffers for ISA DMA. When locking was added, the call to pmap_kextract() was moved up, and as a result the code never updated the physical address to use for DMA if a bounce buffer was used. Restore the earlier location of pmap_kextract() so it takes bounce buffers into account. Tested by: kargl MFC after: 1 week	2012-03-29 18:58:02 +00:00
John Baldwin	45a225844f	Allocate the ioapics[] array dynamically since it is only needed for the duration of madt_setup_io(). This avoids having the array take up permanent space in the BSS. Inspired by: bde MFC after: 2 weeks	2012-03-28 18:53:48 +00:00
John Baldwin	5dba6ec3b3	Move the DTrace return IDT vector back up from 0x20 to 0x92. The 0x20 vector is currently dedicated to servicing IRQ 0 from the 8259A's, so it shouldn't be overloaded for DTrace. Tested by: rstone MFC after: 1 week	2012-03-28 16:32:17 +00:00
Dimitry Andric	d4ddb330c9	Fix the following clang warning in sys/dev/dcons/dcons.c, caused by the recent changes in sys/x86/include/endian.h: sys/dev/dcons/dcons.c:190:15: error: implicit conversion from '__uint32_t' (aka 'unsigned int') to '__uint16_t' (aka 'unsigned short') changes value from 1684238190 to 28526 [-Werror,-Wconstant-conversion] buf->magic = ntohl(DCONS_MAGIC); ^~~~~~~~~~~~~~~~~~ sys/sys/param.h:306:18: note: expanded from: #define ntohl(x) __ntohl(x) ^ ./x86/endian.h:128:20: note: expanded from: #define __ntohl(x) __bswap32(x) ^ ./x86/endian.h:78:20: note: expanded from: __bswap32_gen((__uint32_t)(x)) : __bswap32_var(x)) ^ ./x86/endian.h:68:26: note: expanded from: (((__uint32_t)__bswap16(x) << 16) \| __bswap16((x) >> 16)) ^ ./x86/endian.h:75:53: note: expanded from: __bswap16_gen((__uint16_t)(x)) : __bswap16_var(x))) ~~~~~~~~~~~~~ ^ This is because the __bswapXX_gen() macros (for x86) call the regular __bswapXX() macros. Since the __bswapXX_gen() variants are only called when their arguments are constant, there is no need to do that constancy check recursively. Also, it causes the above error with clang. Fix it by calling __bswap16_gen() from __bswap32_gen(), and similarly, __bswap32_gen() from __bswap64_gen(). While here, add extra parentheses around the __bswap16_gen() macro expansion, to prevent unexpected side effects.	2012-03-24 10:07:21 +00:00
John Baldwin	d8c827012c	Mark the 'lapics' and 'ioapics' arrays here static since they are private to this file. The 'lapics' array was actually shadowing a completely different 'lapics' array that is private to local_apic.c. Reported by: bde MFC after: 2 weeks	2012-03-22 12:23:32 +00:00
Tijl Coosemans	dfb1c11345	Copy amd64 sysarch.h to x86 and merge with i386 sysarch.h. Replace amd64/i386/pc98 sysarch.h with stubs.	2012-03-19 21:57:31 +00:00
Tijl Coosemans	2c7879ea84	Copy i386 specialreg.h to x86 and merge with amd64 specialreg.h. Replace amd64/i386/pc98 specialreg.h with stubs.	2012-03-19 21:34:11 +00:00
Tijl Coosemans	68156ad982	Copy i386 psl.h to x86 and replace amd64/i386/pc98 psl.h with stubs.	2012-03-19 21:29:57 +00:00
Tijl Coosemans	bcde3b9f67	Move userland bits (and some common kernel bits) from amd64 and i386 segments.h to a new x86 segments.h. Add __packed attribute to some structs (just to be sure). Also make it clear that i386 GDT and LDT entries are used in ia64 code.	2012-03-19 21:24:50 +00:00
Tijl Coosemans	6e310b206f	Eliminate ia32_reg.h by moving its contents to x86 and ia64 reg.h. Reviewed by: kib	2012-03-18 19:12:11 +00:00
Tijl Coosemans	01cd19680d	Copy i386 reg.h to x86 and merge with amd64 reg.h. Replace i386/amd64/pc98 reg.h with stubs. The tREGISTER macros are only made visible on i386. These macros are deprecated and should not be available on amd64. The i386 and amd64 versions of struct reg have been renamed to struct __reg32 and struct __reg64. During compilation either __reg32 or __reg64 is defined as reg depending on the machine architecture. On amd64 the i386 struct is also available as struct reg32 which is used in COMPAT_FREEBSD32 code. Most of compat/ia32/ia32_reg.h is now IA64 only. Reviewed by: kib (previous version)	2012-03-18 19:06:38 +00:00
Tijl Coosemans	786645078b	Move userland bits of i386 npx.h and amd64 fpu.h to x86 fpu.h. Remove FPU types from compat/ia32/ia32_reg.h that are no longer needed. Create machine/npx.h on amd64 to allow compiling i386 code that uses this header. The original npx.h and fpu.h define struct envxmm differently. Both definitions have been included in the new x86 header as struct __envxmm32 and struct __envxmm64. During compilation either __envxmm32 or __envxmm64 is defined as envxmm depending on machine architecture. On amd64 the i386 struct is also available as struct envxmm32. Reviewed by: kib	2012-03-16 20:24:30 +00:00
John Baldwin	3b22825af7	Revert the PCIe 4GB boundary issue workaround now that the proper fix is in HEAD. Ok'd by: scottl	2012-03-16 16:12:10 +00:00
Yoshihiro Takahashi	dff207f860	- Fix to build a native i386 kernel without the SMP and atpic. - Merge r232744 changes to pc98. (Allow a kernel to be built with 'nodevice atpic'.) - Move ICU related defines from x86/isa/atpic.c to x86/isa/icu.h and use them in x86/x86/intr_machdep.c. Reviewed by: jhb	2012-03-16 12:13:44 +00:00
John Baldwin	646af7c6af	Move i386's intr_machdep.c to the x86 tree and share it with amd64.	2012-03-09 20:43:29 +00:00
Dimitry Andric	63d094a7e2	Add casts to __uint16_t to the __bswap16() macros on all arches which didn't already have them. This is because the ternary expression will return int, due to the Usual Arithmetic Conversions. Such casts are not needed for the 32 and 64 bit variants. While here, add additional parentheses around the x86 variant, to protect against unintended consequences. MFC after: 2 weeks	2012-03-09 20:34:31 +00:00
Tijl Coosemans	ced8176236	Cast the expression in __bswap16(x) to __uint16_t because it is promoted to int. Reviewed by: dim	2012-03-09 16:39:34 +00:00
Tijl Coosemans	0502467707	Clean up x86 endian.h: - Remove extern "C". There are no functions with external linkage here. [1] - Rename bswapNN_const(x) to bswapNN_gen(x) to indicate that these macros are generic implementations that can take non-constant arguments. [1] - Split up __GNUCLIKE_ASM && __GNUCLIKE_BUILTIN_CONSTANT_P and deal with each separately. - Replace _LP64 with __amd64__ because asm instructions are machine dependent, not ABI dependent. Submitted by: bde [1] Reviewed by: bde	2012-03-09 11:48:56 +00:00
Tijl Coosemans	d8a023328d	Copy amd64 ptrace.h to x86 and merge with i386 ptrace.h. Replace amd64/i386/pc98 ptrace.h with stubs. For amd64 PT_GETXSTATE and PT_SETXSTATE have been redefined to match the i386 values. The old values are still supported but should no longer be used. Reviewed by: kib	2012-03-04 20:24:28 +00:00
Tijl Coosemans	21d0ce7868	Do not use INT64_C and UINT64_C to define 64 bit integer limits. They aren't defined for C++ code unless __STDC_CONSTANT_MACROS is defined. Reported by: jhb	2012-03-04 20:02:20 +00:00
Tijl Coosemans	8b4a1ed0de	Copy amd64 trap.h to x86 and replace amd64/i386/pc98 trap.h with stubs.	2012-03-04 14:12:57 +00:00
Tijl Coosemans	ee0d5ab989	Copy amd64 float.h to x86 and merge with i386 float.h. Replace amd64/i386/pc98 float.h with stubs.	2012-03-04 14:00:32 +00:00
John Baldwin	831ce4cb3d	- Change contigmalloc() to use the vm_paddr_t type instead of an unsigned long for specifying a boundary constraint. - Change bus_dma tags to use bus_addr_t instead of bus_size_t for boundary constraints. These allow boundary constraints to be fully expressed for cases where sizeof(bus_addr_t) != sizeof(bus_size_t). Specifically, it allows a driver to properly specify a 4GB boundary in a PAE kernel. Note that this cannot be safely MFC'd without a lot of compat shims due to KBI changes, so I do not intend to merge it. Reviewed by: scottl	2012-03-01 19:58:34 +00:00
Tijl Coosemans	5b2a5decd1	Copy amd64 stdarg.h to x86 and replace amd64/i386/pc98 stdarg.h with stubs.	2012-02-28 22:30:58 +00:00
Tijl Coosemans	f85ac30a3d	Copy amd64 setjmp.h to x86 and replace amd64/i386/pc98 setjmp.h with stubs.	2012-02-28 22:17:52 +00:00
Ed Maste	3f8e262e8c	Workaround for PCIe 4GB boundary issue Enforce a boundary of no more than 4GB - transfers crossing a 4GB boundary can lead to data corruption due to PCIe limitations. This change is a less-intrusive workaround that can be quickly merged back to older branches; a cleaner implementation will arrive in HEAD later but may require KPI changes. This change is based on a suggestion by jhb@. Reviewed by: scottl, jhb Sponsored by: Sandvine Incorporated MFC after: 3 days	2012-02-28 19:42:40 +00:00
Tijl Coosemans	95b1d16df5	Copy amd64 endian.h to x86 and merge with i386 endian.h. Replace amd64/i386/pc98 endian.h with stubs. In __bswap64_const(x) the conflict between 0xffUL and 0xffULL has been resolved by reimplementing the macro in terms of __bswap32(x). As a side effect __bswap64_var(x) is now implemented using two bswap instructions on i386 and should be much faster. __bswap32_const(x) has been reimplemented in terms of __bswap16(x) for consistency.	2012-02-28 19:39:54 +00:00
Tijl Coosemans	8770e9db97	Copy amd64 _stdint.h to x86 and merge with i386 _stdint.h. Replace amd64/i386/pc98 _stdint.h with stubs.	2012-02-28 18:38:33 +00:00
Tijl Coosemans	8cfa93e4be	Copy amd64 _limits.h to x86 and merge with i386 _limits.h. Replace amd64/i386/pc98 _limits.h with stubs.	2012-02-28 18:24:28 +00:00
Tijl Coosemans	8f77be2b4c	Copy amd64 _types.h to x86 and merge with i386 _types.h. Replace existing amd64/i386/pc98 _types.h with stubs.	2012-02-28 18:15:28 +00:00
John Baldwin	8fef42c511	- Panic up front if a kernel does not include 'device atpic' and an APIC is not found. - Don't panic if lapic_enable_cmc() is called and the APIC is not enabled. This can happen due to booting a kernel with APIC disabled on a CPU that supports CMCI. - Wrap a long line.	2012-02-27 17:33:16 +00:00
Alexander Kabaev	2f42a9bf0d	Fix apparent logic reversal in setting the 'auto_mode' flag. MFC after: 2 weeks	2012-02-26 21:24:27 +00:00
John Baldwin	289908743e	Fix a few bugs in the SRAT parsing code: - Actually increment ndomain when building our list of known domains so that we can properly renumber them to be 0-based and dense. - If the number of domains exceeds the configured maximum (VM_NDOMAIN), bail out of processing the SRAT and disable NUMA rather than hitting an obscure panic later. - Don't bother parsing the SRAT at all if VM_NDOMAIN is set to 1 to disable NUMA (the default). Reported by: phk (2) MFC after: 1 week	2012-01-03 20:53:58 +00:00
Ed Schouten	b66c0c3405	Get rid of kludgy per-descriptor state handling in acpi_apm. Where i386/bios/apm.c requires no per-descriptor state, the ACPI version of this device does. Instead of using hackish clone lists that leave stale device nodes lying around, use the cdevpriv API.	2011-12-05 16:08:18 +00:00
Marius Strobl	4b7ec27007	- There's no need to overwrite the default device method with the default one. Interestingly, these are actually the default for quite some time (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9) since r52045) but even recently added device drivers do this unnecessarily. Discussed with: jhb, marcel - While at it, use DEVMETHOD_END. Discussed with: jhb - Also while at it, use __FBSDID.	2011-11-22 21:28:20 +00:00
Ed Schouten	6472ac3d8a	Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.	2011-11-07 15:43:11 +00:00
Ed Schouten	d745c852be	Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. This means that their use is restricted to a single C file.	2011-11-07 06:44:47 +00:00
John Baldwin	4d99cfb313	Ignore SRAT memory entries if the memory range does not overlap with an existing phys_avail[] table. If a hw.physmem setting causes a memory domain to not be present in phys_avail[], the SRAT table will now be ignored rather than triggering a panic when a CPU in the missing domain tries to allocate a page. MFC after: 1 week	2011-10-05 16:03:47 +00:00
Attilio Rao	6aba400a70	Fix a deficiency in the selinfo interface: If a selinfo object is recorded (via selrecord()) and then it is quickly destroyed, with the waiters missing the opportunity to awake, at the next iteration they will find the selinfo object destroyed, causing a PF#. That happens because the selinfo interface has no way to drain the waiters before destroying the registered selinfo object. Also this race is quite rare to get in practice, because it would require a selrecord(), a poll request by another thread and a quick destruction of the selrecord()'ed selinfo object. Fix this by adding the seldrain() routine which should be called before destroying the selinfo objects (in order to avoid such case), and fix the present cases where it might have already been called. Sometimes, the context is safe enough to prevent this type of race, like it happens in device drivers which install selinfo objects on poll callbacks. There, the destruction of the selinfo object happens at driver detach time, when all the filedescriptors should be already closed, thus there cannot be a race. For this case, mfi(4) device driver can be set as an example, as it implements a fully correct logic for preventing this from happening. Sponsored by: Sandvine Incorporated Reported by: rstone Tested by: pluknet Reviewed by: jhb, kib Approved by: re (bz) MFC after: 3 weeks	2011-08-25 15:51:54 +00:00
Mike Silbersack	5cf8ac1bc2	Disable TSC usage inside SMP VM environments. On my VMware ESXi 4.1 environment with a core i5-2500K, operation in this mode causes timeouts from the mpt driver. Switching to the ACPI-fast timer resolves this issue. Switching the VM back to single CPU mode also works, which is why I have not disabled the TSC in that mode. I did not test with KVM or other VM environments, but I am being cautious and assuming that the TSC is not reliable in SMP mode there as well. Reviewed by: kib Approved by: re (kib) MFC after: Not applicable, the timecounter code is new for 9.x	2011-08-22 03:10:29 +00:00
John Baldwin	869e878c19	Fix build when NEW_PCIB is not defined. Submitted by: gcooper (partially) Pointy hat to: jhb	2011-07-16 14:05:34 +00:00
John Baldwin	34ff71eecd	Respect the BIOS/firmware's notion of acceptable address ranges for PCI resource allocation on x86 platforms: - Add a new helper API that Host-PCI bridge drivers can use to restrict resource allocation requests to a set of address ranges for different resource types. - For the ACPI Host-PCI bridge driver, use Producer address range resources in _CRS to enumerate valid address ranges for a given Host-PCI bridge. This can be disabled by including "hostres" in the debug.acpi.disabled tunable. - For the MPTable Host-PCI bridge driver, use entries in the extended MPTable to determine the valid address ranges for a given Host-PCI bridge. This required adding code to parse extended table entries. Similar to the new PCI-PCI bridge driver, these changes are only enabled if the NEW_PCIB kernel option is enabled (which is enabled by default on amd64 and i386). Approved by: re (kib)	2011-07-15 21:08:58 +00:00
Jung-uk Kim	08e1b4f4a9	If TSC stops ticking in C3, disable deep sleep when the user forcefully selects TSC as timecounter hardware. Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de)	2011-07-14 21:00:26 +00:00
John Baldwin	1368987ae4	Move {amd64,i386}/pci/pci_bus.c and {amd64,i386}/include/pci_cfgreg.h to the x86 tree. The $PIR code is still only enabled on i386 and not amd64. While here, make the qpi(4) driver conditional on 'device pci'.	2011-06-22 21:04:13 +00:00
Jung-uk Kim	a49399a903	Set negative quality to TSC timecounter when C3 state is enabled for Intel processors unless the invariant TSC bit of CPUID is set. Intel processors may stop incrementing TSC when DPSLP# pin is asserted, according to Intel processor manuals, i. e., TSC timecounter is useless if the processor can enter deep sleep state (C3/C4). This problem was accidentally uncovered by r222869, which increased timecounter quality of P-state invariant TSC, e.g., for Core2 Duo T5870 (Family 6, Model f) and Atom N270 (Family 6, Model 1c). Reported by: Fabian Keil (freebsd-listen at fabiankeil dot de) Ian FREISLICH (ianf at clue dot co dot za) Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de) - Core2 Duo T5870 (C3 state available/enabled) jkim - Xeon X5150 (C3 state unavailable)	2011-06-22 16:40:45 +00:00
Jung-uk Kim	5df88f46bb	Teach the compiler how to shift TSC value efficiently. As noted in r220631, sometimes the compiler inserts redundant instructions to preserve unused upper 32 bits even when it is cast to a 32-bit value. Unfortunately, it seems the problem becomes more serious when it is shifted, especially on amd64.	2011-06-17 21:41:06 +00:00
Jung-uk Kim	bc8e4ad2ef	Tidy up r222866. - Re-add accidentally removed atomic op. for sysctl(9) handler. - Remove a period(`.') at the end of a debugging message. - Consistently spell "low" for "TSC-low" timecounter throughout. Pointed out by: bde	2011-06-08 23:44:59 +00:00
Jung-uk Kim	26e6537a73	Increase quality of TSC (or TSC-low) timecounter to 1000 if it is P-state invariant. For SMP case (TSC-low), it also has to pass SMP synchronization test and the CPU vendor/model has to be white-listed explicitly. Currently, all Intel CPUs and single-socket AMD Family 15h processors are listed here. Discussed with: hackers	2011-06-08 20:08:06 +00:00
Jung-uk Kim	95f2f0985b	Introduce low-resolution TSC timecounter "TSC-low". It replaces the normal TSC timecounter if TSC frequency is higher than ~4.29 GHz (or 2^32-1 Hz) or multiple CPUs are present. The "TSC-low" frequency is always lower than a preset maximum value and derived from TSC frequency (by being halved until it becomes lower than the maximum). Note the maximum value for SMP case is significantly lower than UP case because we want to reduce (rare but known) "temporal anomalies" caused by non-serialized RDTSC instruction. Normally, it is still higher than "ACPI-fast" timecounter frequency (which was default timecounter hardware for a long time until r222222) to be useful.	2011-06-08 19:38:31 +00:00
Jung-uk Kim	75aa1914d5	Remove a redundant assignment since r221703.	2011-06-08 18:52:42 +00:00
Attilio Rao	bd55ede060	MFC	2011-05-09 18:53:13 +00:00
Jung-uk Kim	65e7d70b09	Implement boot-time TSC synchronization test for SMP. This test is executed when the user has indicated that the system has synchronized TSCs or it has P-state invariant TSCs. For the former case, we may clear the tunable if it fails the test to prevent accidental foot-shooting. For the latter case, we may set it if it passes the test to notify the user that it may be usable.	2011-05-09 17:34:00 +00:00
Attilio Rao	aa8b9e0706	MFC	2011-05-06 22:45:33 +00:00
John Baldwin	f9a9473702	Retire isa_setup_intr() and isa_teardown_intr() and use the generic bus versions instead. They were never needed as bus_generic_intr() and bus_teardown_intr() had been changed to pass the original child device up in 42734, but the ISA bus was not converted to new-bus until 45720.	2011-05-06 13:48:53 +00:00
Alexander Motin	00aa5aab1e	Some changes around LAPIC timer programming. This fixes a heavy interrupt storm and the resulting system freeze when using the LAPIC timer in one-shot mode under Xen HVM. There, unlike real hardware, programming the timer with a zero period almost immediately causes an interrupt.	2011-05-05 18:56:48 +00:00
Attilio Rao	71a19bdc64	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architectures. cpuset_t, on the other hand, is implemented as an array of longs, and easily extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, need to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primarily done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to come soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
John Baldwin	83c41143ca	Reimplement how PCI-PCI bridges manage their I/O windows. Previously the driver would verify that requests for child devices were confined to any existing I/O windows, but the driver relied on the firmware to initialize the windows and would never grow the windows for new requests. Now the driver actively manages the I/O windows. This is implemented by allocating a bus resource for each I/O window from the parent PCI bus and suballocating that resource to child devices. The suballocations are managed by creating an rman for each I/O window. The suballocated resources are mapped by passing the bus_activate_resource() call up to the parent PCI bus. Windows are grown when needed by using bus_adjust_resource() to adjust the resource allocated from the parent PCI bus. If the adjust request succeeds, the window is adjusted and the suballocation request for the child device is retried. When growing a window, the rman_first_free_region() and rman_last_free_region() routines are used to determine if the front or end of the existing I/O window is free. From that, the smallest ranges that need to be added to either the front or back of the window are computed. The driver will first try to grow the window in whichever direction requires the smallest growth first followed by the other direction if that fails. Subtractive bridges will first attempt to satisfy requests for child resources from I/O windows (including attempts to grow the windows). If that fails, however, the request is passed up to the parent PCI bus directly. The PCI-PCI bridge driver will try to use firmware-assigned ranges for child BARs first and only allocate a "fresh" range if that specific range cannot be accommodated in the I/O window. This allows systems where the firmware assigns resources during boot but later wipes the I/O windows (some ACPI BIOSen are known to do this) to "rediscover" the original I/O window ranges. The ACPI Host-PCI bridge driver has been adjusted to correctly honor hw.acpi.host_mem_start and the I/O port equivalent when a PCI-PCI bridge makes a wildcard request for an I/O window range. The new PCI-PCI bridge driver is only enabled if the NEW_PCIB kernel option is enabled. This is a transition aid to allow platforms that do not yet support bus_activate_resource() and bus_adjust_resource() in their Host-PCI bridge drivers (and possibly other drivers as needed) to use the old driver for now. Once all platforms support the new driver, the kernel option and old driver will be removed. PR: kern/143874 kern/149306 Tested by: mav	2011-05-03 17:37:24 +00:00
Jung-uk Kim	a990fbf972	Fix build with clang. Please note there is an LLVM/Clang PR: http://llvm.org/bugs/show_bug.cgi?id=9379 Reported by: rpaulo, dim	2011-05-02 17:08:36 +00:00
John Baldwin	d2c9344ff9	Add implementations of BUS_ADJUST_RESOURCE() to the PCI bus driver, generic PCI-PCI bridge driver, x86 nexus driver, and x86 Host to PCI bridge drivers.	2011-05-02 14:13:12 +00:00
John Baldwin	b67d11bbcc	Change rman_manage_region() to actually honor the rm_start and rm_end constraints on the rman and reject attempts to manage a region that is out of range. - Fix various places that set rm_end incorrectly (to ~0 or ~0u instead of ~0ul). - To preserve existing behavior, change rman_init() to set rm_start and rm_end to allow managing the full range (0 to ~0ul) if they are not set by the caller when rman_init() is called.	2011-04-29 18:41:21 +00:00
Jung-uk Kim	5da5812ba7	Detect VMware guest and set the TSC frequency as reported by the hypervisor. VMware products virtualize TSC and it runs at a fixed frequency in so-called "apparent time". Although virtualized i8254 also runs in apparent time, TSC calibration always gives a slightly off frequency because of the complicated timer emulation and lost-tick correction mechanism.	2011-04-29 18:20:12 +00:00
Jung-uk Kim	5ac44f727f	Turn off periodic recalibration of CPU ticker frequency if it is invariant.	2011-04-28 17:56:02 +00:00
Attilio Rao	2be767e069	Add watchdog patting during (shutdown-time) disk syncing and disk dumping. With the option SW_WATCHDOG on, these operations are doomed to let the watchdog fire if they take too long. I implemented the stubs this way because I really want the wdog_kern_* KPI not to depend on SW_WATCHDOG being on (and really, the option only enables watchdog activation in hardclock) and also to avoid calling them when not necessary (avoiding involuntary watchdog activations). Sponsored by: Sandvine Incorporated Discussed with: emaste, des MFC after: 2 weeks	2011-04-28 16:02:05 +00:00
Jung-uk Kim	43d645f96b	Use ACPI-supplied CPU frequencies instead of estimated ones as we are about to use other values from the same table anyway. MFC after: 3 days	2011-04-27 00:32:35 +00:00
Jung-uk Kim	8143750196	Use newly added rdtsc32() for DELAY(9) as well.	2011-04-14 19:11:45 +00:00
Jung-uk Kim	0e78005e5c	Work around an emulator problem where virtual CPU advertises TSC is P-state invariant and APERF/MPERF MSRs exist but these MSRs never tick. When we calculate effective frequency from cpu_est_clockrate(), it caused panic of division-by-zero. Now we test whether these MSRs actually increase to avoid such foot-shooting. Reported by: dim Tested by: dim	2011-04-14 17:50:26 +00:00
Jung-uk Kim	727c7b2d66	Use newly added rdtsc32() for the timecounter_get_t method.	2011-04-14 17:08:23 +00:00
Jung-uk Kim	5331d61da4	Add some tunable descriptions about x86 timers. Requested by: arundel	2011-04-14 00:07:08 +00:00
Jung-uk Kim	e94d5ad227	Do not use TSC for DELAY(9) if it is not P-state invariant to avoid possible foot-shooting. DELAY() becomes unreliable when TSC frequency varies wildly, especially when cpufreq(4) and powerd(8) are used at the same time.	2011-04-12 22:41:52 +00:00
Jung-uk Kim	155094d77a	Probe capability to find effective frequency. When the TSC is P-state invariant, APERF/MPERF ratio can be used to find effective frequency.	2011-04-12 22:15:46 +00:00
Jung-uk Kim	a4e4127f42	Add a new tunable 'machdep.disable_tsc_calibration' to allow skipping TSC frequency calibration. For Intel processors, if brand string from CPUID contains its nominal frequency, this frequency is used instead.	2011-04-12 21:08:34 +00:00
Jung-uk Kim	57d7a7fb0a	Merge two similar functions to reduce duplication.	2011-04-11 19:27:44 +00:00
Jung-uk Kim	80c2cdcffe	Refactor DELAYDEBUG as it is only useful for correcting i8254 frequency.	2011-04-08 19:54:29 +00:00
Jung-uk Kim	3453537fa5	Use atomic load & store for TSC frequency. It may be overkill for amd64 but safer for i386 because it can be easily over 4 GHz now. More worse, it can be easily changed by user with 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add "virtual TSC" with lower frequency if maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).	2011-04-07 23:28:28 +00:00
Jung-uk Kim	7ebbcb21ba	Revert r219676. Requested by: jhb, bde	2011-03-16 16:44:08 +00:00
Jung-uk Kim	a8f8643e3a	Do not let machdep.tsc_freq modify tsc_freq itself. It is bad for i386 as it does not operate atomically. Actually, it serves no purpose. Noticed by: bde	2011-03-15 19:47:20 +00:00
Jung-uk Kim	38b8542ca9	Deprecate tsc_present as the last of its real consumers finally disappeared.	2011-03-15 17:19:52 +00:00
Jung-uk Kim	856e88c1f5	When TSC is unavailable, broken or disabled and the current timecounter has better quality than i8254 timer, use it for DELAY(9).	2011-03-14 22:05:59 +00:00
Jung-uk Kim	79422085d4	Add a tunable "machdep.disable_tsc" to turn off TSC. Specifically, it turns off boot-time CPU frequency calibration, DELAY(9) with TSC, and using TSC as a CPU ticker. Note tsc_present does not change by this tunable.	2011-03-11 00:44:32 +00:00
Jung-uk Kim	a106a27c6a	Turn off pointless P-state invariant TSC detection based on CPU model on a virtual machine.	2011-03-10 23:06:13 +00:00
Jung-uk Kim	bc34c87e81	Deprecate rarely used tsc_is_broken. Instead, we zero out tsc_freq because it is almost always used with tsc_freq any way.	2011-03-10 20:02:58 +00:00
Jung-uk Kim	49abdda9b7	Set C1 "I/O then Halt" capability bit for Intel EIST. Some broken BIOSes refuse to load external SSDTs if this bit is unset for _PDC. It seems Linux and OpenSolaris did the same long ago. MFC after: 1 week	2011-02-25 23:14:24 +00:00
Rebecca Cran	974206cf70	Fix typos - remove duplicate "is". PR: docs/154934 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-23 09:22:33 +00:00
John Baldwin	e119a7bcdb	Use a dedicated taskqueue with a thread that runs at a software-interrupt priority for the periodic polling of the machine check registers.	2011-02-03 13:09:22 +00:00
Matthew D Fleming	cbc134ad03	Introduce signed and unsigned version of CTLTYPE_QUAD, renaming existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().	2011-01-19 23:00:25 +00:00
John Baldwin	072e9838e2	If an interrupt on an I/O APIC is moved to a different CPU after it has started to execute, it seems that the corresponding ISR bit in the "old" local APIC can be cleared. This causes the local APIC interrupt routine to fail to find an interrupt to service. Rather than panic'ing in this case, simply return from the interrupt without sending an EOI to the local APIC. If there are any other pending interrupts in other ISR registers, the local APIC will assert a new interrupt. Tested by: steve	2011-01-13 17:00:22 +00:00
Matthew D Fleming	5bc9ca019a	Revert to using bus_size_t for the bounce_zone's alignment member. Requested by: jhb	2011-01-13 00:52:57 +00:00
Matthew D Fleming	407dcb49df	Fix a brain fart. Since this file is shared between i386 and amd64, a bus_size_t may be 32 or 64 bits. Change the bounce_zone alignment field to explicitly be 32 bits, as I can't really imagine a DMA device that needs anything close to 2GB alignment of data.	2011-01-12 21:08:49 +00:00
Matthew D Fleming	fbbb13f962	sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. Commit the kernel changes.	2011-01-12 19:54:19 +00:00
John Baldwin	58ccf5b41c	Remove unneeded includes of <sys/linker_set.h>. Other headers that use it internally contain nested includes. Reviewed by: bde	2011-01-11 13:59:06 +00:00
Tijl Coosemans	d22e78d6b9	Copy powerpc/include/_inttypes.h to x86 and replace i386/amd64/pc98 headers with stubs. Approved by: kib (mentor)	2011-01-08 18:09:48 +00:00
John Baldwin	e1070bf509	Drop the icu_lock spinlock while pausing briefly after masking the interrupt in the I/O APIC before moving it to a different CPU. If the interrupt had been triggered by the I/O APIC after locking icu_lock but before we masked the pin in the I/O APIC, then this could cause the interrupt to be pending on the "old" CPU and it would finally trigger after we had moved the interrupt to the new CPU. This could cause us to panic as there was no interrupt source associated with the old IDT vector on the old CPU. Dropping the lock after the interrupt is masked but before it is moved allows the interrupt to fire and be handled in this case before it is moved. Tested by: Daniel Braniss danny of cs huji ac il MFC after: 1 week	2010-12-23 15:17:28 +00:00
Tijl Coosemans	81bd5041a2	Merge amd64 and i386 bus.h and move the resulting header to x86. Replace the original amd64 and i386 headers with stubs. Rename (AMD64\|I386)_BUS_SPACE_* to X86_BUS_SPACE_* everywhere. Reviewed by: imp (previous version), jhb Approved by: kib (mentor)	2010-12-20 16:39:43 +00:00
John Baldwin	686b1e6bc0	Small style fixes: - Avoid side-effect assignments in if statements when possible. - Don't use ! to check for NULL pointers, explicitly check against NULL. - Explicitly check error return values against 0. - Don't use INTR_MPSAFE for interrupt handlers with only filters as it is meaningless. - Remove unneeded function casts.	2010-12-16 17:05:28 +00:00
Jung-uk Kim	cc0eda4efd	Remove AMD Family 0Fh, Model 6Bh, Stepping 2 from the list of P-state invariant CPUs. I do not believe this model is P-state invariant any more. Maybe cpufreq(4) was broken at the time of commit. :-(	2010-12-09 21:29:36 +00:00
Colin Percival	91ff9dc058	Replace i386/i386/busdma_machdep.c and amd64/amd64/busdma_machdep.c (which are identical) with a single x86/x86/busdma_machdep.c.	2010-12-09 06:41:50 +00:00
Jung-uk Kim	dd7d207dcb	Merge sys/amd64/amd64/tsc.c and sys/i386/i386/tsc.c and move to sys/x86/x86. Discussed with: avg	2010-12-08 00:09:24 +00:00
Tijl Coosemans	ce4ec51dbe	Merge amd64/i386 _align.h by aligning on the size of register_t (copied from powerpc). Reviewed by: imp, jhb Approved by: kib (mentor)	2010-11-26 10:59:20 +00:00
Andriy Gapon	c0e4a357a2	x86/local_apic: use newly added ARAT bit definition ARAT: APIC-Timer-always-running feature. Suggested by: mav MFC after: 12 days	2010-11-23 14:36:14 +00:00
Andriy Gapon	40934baa60	hwpstate: use CPU_FOREACH when binding to all available processors Also, add a comment mentioning _PSD - on some systems it's enough to put one logical CPU into a particular P-state to make other CPUs in the same domain to enter that P-state. Also, call sched_unbind() after the loop - sched_bind() automatically rebinds from previous CPU to a new one, and the new arrangement of code is safer against early loop exit. Plus one minor style nit. MFC after: 10 days	2010-11-16 12:43:45 +00:00
Jung-uk Kim	19da400c64	Move identical copies of apm_bios.h to sys/x86/include, replace them with stubs, and adjust PC98 stub accordingly. Reviewed by: imp, nyan	2010-11-11 19:36:21 +00:00
Andriy Gapon	b3fa872420	make it possible to actually enable hwpstate_verbose Either via the tunable or the sysctl. MFC after: 3 days	2010-11-11 17:30:49 +00:00
Jung-uk Kim	93a8847473	Make APM emulation look closer to its origin. Use device_get_softc(9) instead of hardcoding acpi(4) unit number as we have device_t for it.	2010-11-10 18:50:12 +00:00
Jung-uk Kim	7c2bf852d7	Refactor acpi_machdep.c for amd64 and i386, move APM emulation into a new file acpi_apm.c, and place it on sys/x86/acpica.	2010-11-10 01:29:56 +00:00
Attilio Rao	fcb250f392	Move the mptable.h under x86/include/. Sponsored by: Sandvine Incorporated MFC after: 14 days	2010-11-09 20:28:09 +00:00
Jung-uk Kim	cedd86cafa	Now OsdEnvironment.c is identical on amd64 and i386. Move it to a new home.	2010-11-09 00:27:18 +00:00
John Baldwin	13e25cb7a5	Move the MADT parser for amd64 and i386 to sys/x86/acpica now that it is identical on both platforms.	2010-11-08 20:57:02 +00:00
John Baldwin	c5b0b5fc6b	Sync the APIC startup sequence with amd64: - Register APIC enumerators at SI_SUB_TUNABLES - 1 instead of SI_SUB_CPU - 1. - Probe CPUs at SI_SUB_TUNABLES - 1. This allows i386 to set a truly accurate mp_maxid value rather than always setting it to MAXCPU - 1.	2010-11-08 20:35:09 +00:00
John Baldwin	95b3d590e2	Only dump the values of the PMC and CMCI local vector table entries on a local APIC if those LVT entries are valid. This quiets spurious illegal register local APIC errors during boot on a CPU that doesn't support those vectors. MFC after: 1 week	2010-11-08 20:03:51 +00:00
John Baldwin	5b867e813a	Cosmetic change to revert one of my earlier ones. #if __i386__ && PAE is identical to just #if PAE since PAE is only a valid option for i386. Submitted by: attilio	2010-11-02 20:16:41 +00:00
John Baldwin	239da85bbc	Further tweaks to the ram_attach() routine: - Use > 2^32 - 1 instead of >= when checking for memory regions above 4G. - Skip SMAP entries > 4G on i386 rather than breaking out of the loop since SMAP entries are not guaranteed to be in order. - Remove 'i' and loop over 'rid' directly in the dump_avail[] case. - Only check for 4G regions in the dump_avail[] case on i386 if PAE is enabled since vm_paddr_t is 32-bit in the !PAE case. Submitted by: alc	2010-11-02 17:56:16 +00:00
John Baldwin	204404e890	Skip SMAP regions above 4GB on i386 since they will not fit into a long. While here, update some comments to better explain the new code flow. Tested by: dhw	2010-11-02 13:04:25 +00:00
John Baldwin	32c3d3b6e6	Move <machine/apicreg.h> to <x86/apicreg.h>.	2010-11-01 18:18:46 +00:00
John Baldwin	5ecdb3c46b	Move the <machine/mca.h> header to <x86/mca.h>.	2010-11-01 17:40:35 +00:00
Attilio Rao	4e30bd6244	- Merge ram_attach() implementation for i386 and amd64 - Rename RES_BUS_SPACE_* into BUS_SPACE_* for consistency - Trim out an unnecessary checking condition Sponsored by: Sandvine Incorporated Requested and reviewed by: jhb	2010-10-29 18:33:43 +00:00
Attilio Rao	ba2a27351b	Merge nexus.c from amd64 and i386 to x86 subtree. Sponsored by: Sandvine Incorporated Tested by: gianni	2010-10-28 16:31:39 +00:00
Attilio Rao	a3da97926d	Merge the mptable support from MD bits to x86 subtree. Sponsored by: Sandvine Incorporated Discussed with: jhb	2010-10-28 07:58:06 +00:00
Attilio Rao	b2724beede	Style fix. Reported by: bde, dim	2010-10-26 18:01:28 +00:00
Attilio Rao	61ba91df0d	Remove usage of PRI* macro for style compliancy. Requested by: bde, jhb Sponsored by: Sandvine Incorporated	2010-10-26 16:16:15 +00:00
Attilio Rao	256439c972	Merge dump_machdep.c i386/amd64 under the x86 subtree. Sponsored by: Sandvine Incorporated Tested by: gianni	2010-10-26 12:46:26 +00:00
John Baldwin	0689bdcc19	Use 'saveintr' instead of 'savecrit' or 'eflags' to hold the state returned by intr_disable(). Requested by: bde	2010-10-25 15:31:13 +00:00
Andriy Gapon	2b89f1fc9e	atrtc: remove (pre-)historic check of RTC NVRAM at address 0x0e Old scrolls tell that once upon a time IBM AT BIOS was known to put some useful system diagnostic information into RTC NVRAM. It is not really known if and for how long PC BIOSes followed that convention, but I believe that many, if not all, modern BIOSes do not do that any more (not mentioning other types of x86 firmware). Some diagnostic bits don't even make any sense any longer. The check results in confusing messages upon boot on some systems. So I am removing it. Discussed with: bde, jhb, mav MFC after: 3 weeks	2010-10-16 10:45:36 +00:00
Alexander Motin	d3979248ac	Restore pre-r212778 optimization, skipping timer reprogramming when it is not necessary. This avoids a time counter jump of up to 1/18s when the base frequency is slightly tuned via the machdep.i8254_freq sysctl. Fix a few style things. Suggested by: bde	2010-09-18 07:36:43 +00:00
Alexander Motin	9500655e5a	Add one-shot mode support to attimer (i8254) event timer. Unluckily, using one-shot mode is impossible, when same hardware used for time counting. Introduce new tunable hint.attimer.0.timecounter, setting which to 0 disables i8254 time counter and allows one-shot mode. Note, that on some systems there may be no other reliable enough time counters, so this tunable should be used with understanding.	2010-09-17 04:48:50 +00:00
Alexander Motin	c59528330a	Few whitespace cleanups and comments tunings. Submitted by: arundel	2010-09-16 02:59:25 +00:00
Alexander Motin	a157e42516	Refactor timer management code with priority to one-shot operation mode. The main goal of this is to generate timer interrupts only when there is some work to do. When CPU is busy interrupts are generated at full rate of hz + stathz to fulfill scheduler and timekeeping requirements. But when CPU is idle, only minimum set of interrupts (down to 8 interrupts per second per CPU now), needed to handle scheduled callouts is executed. This allows significantly increase idle CPU sleep time, increasing effect of static power-saving technologies. Also it should reduce host CPU load on virtualized systems, when guest system is idle. There is set of tunables, also available as writable sysctls, allowing to control wanted event timer subsystem behavior: kern.eventtimer.timer - allows to choose event timer hardware to use. On x86 there is up to 4 different kinds of timers. Depending on whether chosen timer is per-CPU, behavior of other options slightly differs. kern.eventtimer.periodic - allows to choose periodic and one-shot operation mode. In periodic mode, current timer hardware taken as the only source of time for time events. This mode is quite alike to previous kernel behavior. One-shot mode instead uses currently selected time counter hardware to schedule all needed events one by one and program timer to generate interrupt exactly in specified time. Default value depends on chosen timer capabilities, but one-shot mode is preferred, until other is forced by user or hardware. kern.eventtimer.singlemul - in periodic mode specifies how many times higher timer frequency should be, to not strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted. kern.eventtimer.idletick - makes each CPU to receive every timer interrupt independently of whether they are busy or not. By default this option is disabled. If chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generated. As soon as this patch modifies cpu_idle() on some platforms, I have also refactored one on x86. Now it makes use of MONITOR/MWAIT instructions (if supported) under high sleep/wakeup rate, as fast alternative to other methods. It allows SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads. Tested by: many (on i386, amd64, sparc64 and powerpc) H/W donated by: Gheorghe Ardelean Sponsored by: iXsystems, Inc.	2010-09-13 07:25:35 +00:00
John Baldwin	e83ea6241a	Each processor socket in a QPI system has a special PCI bus for the "uncore" devices (such as the memory controller) in that socket. Stop hardcoding support for two busses, but instead start probing buses at domain 0, bus 255 and walk down until a bus probe fails. Also, do not probe a bus if it has already been enumerated elsewhere (e.g. if ACPI ever enumerates these buses in the future).	2010-09-07 13:50:02 +00:00
Rui Paulo	400dda6646	When DTrace is enabled, make sure we don't overwrite the IDT_DTRACE_RET entry with an IRQ for some hardware component. Reviewed by: jhb Sponsored by: The FreeBSD Foundation	2010-08-30 18:12:21 +00:00
John Baldwin	8bddaf9007	Correctly ensure that the CPU family is 0x6, not non-zero. Submitted by: Dimitry Andric	2010-08-25 20:37:58 +00:00
John Baldwin	c2175767b7	Intel QPI chipsets actually provide two extra "non-core" PCI buses that provide PCI devices for various hardware such as memory controllers, etc. These PCI buses are not enumerated via ACPI however. Add qpi(4) pseudo bus and Host-PCI bridge drivers to enumerate these buses. Currently the driver uses the CPU ID to determine the bridges' presence. In collaboration with: Joseph Golio @ Isilon Systems MFC after: 2 weeks	2010-08-25 19:12:05 +00:00
Alexander Motin	733cb5ec90	Enable timer interrupt before starting timer. This allows to handle very short periods without interrupt loss.	2010-08-24 16:08:01 +00:00
John Baldwin	6676877bd9	When performing a sanity check on the SRAT table to ensure that each memory domain has an assigned CPU, ignore disabled CPUs. Previously disabled CPUs were counted as being in domain 0. Reported by: mdf	2010-07-29 17:37:35 +00:00
John Baldwin	a955c461ad	The corrected error count field is dependent on CMCI, not TES. MFC after: 1 week	2010-07-28 21:52:09 +00:00
John Baldwin	dd540b4623	Add a parser for the ACPI SRAT table for amd64 and i386. It sets PCPU(domain) for each CPU and populates a mem_affinity array suitable for the NUMA support in the physical memory allocator. Reviewed by: alc	2010-07-27 20:40:46 +00:00
Alexander Motin	017cb944b1	Increment td->td_intr_nesting_level for LAPIC timer interrupts. Among other things it hints SCHED_ULE to run clock swi handlers on their native CPUs, avoiding many unneeded IPI_PREEMPT calls.	2010-07-24 10:49:59 +00:00
Alexander Motin	599cf0f197	Fix several un-/signedness bugs of r210290 and r210293. Add one more check.	2010-07-20 15:48:29 +00:00
Alexander Motin	51636352b6	Extend timer driver API to report also minimal and maximal supported period lengths. Make MI wrapper code to validate periods in request. Make kernel clock management code to honor these hardware limitations while choosing hz, stathz and profhz values.	2010-07-20 10:58:56 +00:00
Alexander Motin	28ab822d8a	Move timeevents.c to MI code, as it is not x86-specific. I already have it working on Marvell ARM SoCs, and it would be nice to unify timer code between more platforms.	2010-07-14 13:31:27 +00:00
Alexander Motin	ebda1414ec	Remove some unneeded includes. Code now can be built on ARM.	2010-07-14 10:49:14 +00:00
Alexander Motin	8a6870808d	Rise knowledge about curthread->td_intr_frame by one step. Make timer callback argument really opaque. Not repeat interrupt handler's problem in case somebody will ever need to have both argument and frame.	2010-07-13 12:46:06 +00:00
Alexander Motin	75e24dd8ce	Unify pc98 event timer code with the rest of x86. Reviewed by: nyan@	2010-07-13 06:57:27 +00:00
Alexander Motin	91751b1a86	Instead of deleting existing IRQ resource, which is not really working for ACPI bus, find wanted IRQ rid or spare one. This should fix panic during boot on systems reporting fancy IRQ numbers for attimer and atrtc.	2010-07-12 06:46:17 +00:00
Alexander Motin	a2d81f6d1f	Make kernel panic with reasonable message if no usable event timer found.	2010-07-11 17:08:37 +00:00
Alexander Motin	a7d6757c3e	Allow attimer to be hinted at ISA if not reported by ISA PNP or ACPI. Rephrase respective atrtc code same way to be more readable.	2010-07-01 18:59:05 +00:00
Alexander Motin	6019ba4e4b	Rework r209456: Instead of using fake rid (which ISA doesn't like), delete untrusted IRQ resource and let it be recreated.	2010-07-01 18:51:18 +00:00
Alexander Motin	926911c8ff	Do not trust IRQ reported by ACPI. There are cases when it is wrong.	2010-06-23 05:43:21 +00:00
Alexander Motin	49ed68bbf3	Add "legacy route" support to HPET driver. When enabled, this mode makes HPET to steal IRQ0 from i8254 and IRQ8 from RTC timers. It can be suitable for HPETs without FSB interrupts support, as it gives them two unshared IRQs. It allows them to provide one per-CPU event timer on dual-CPU system, that should be suitable for further tickless kernels. To enable it, such lines may be added to /boot/loader.conf: hint.atrtc.0.clock=0 hint.attimer.0.clock=0 hint.hpet.0.legacy_route=1	2010-06-22 19:42:27 +00:00
Alexander Motin	df471e067f	Fix i386 LINT build broken by r209371. There appeared such legacy thing as APM, that somehow breaking RTC.	2010-06-21 19:53:47 +00:00
Alexander Motin	875b8844be	Implement new event timers infrastructure. It provides unified APIs for writing event timer drivers, for choosing best possible drivers by machine independent code and for operating them to supply kernel with hardclock(), statclock() and profclock() events in unified fashion on various hardware. Infrastructure provides support for both per-CPU (independent for every CPU core) and global timers in periodic and one-shot modes. MI management code at this moment uses only periodic mode, but one-shot mode use planned for later, as part of tickless kernel project. For this moment infrastructure used on i386 and amd64 architectures. Other archs are welcome to follow, while their current operation should not be affected. This patch updates existing drivers (i8254, RTC and LAPIC) for the new order, and adds event timers support into the HPET driver. These drivers have different capabilities: LAPIC - per-CPU timer, supports periodic and one-shot operation, may freeze in C3 state, calibrated on first use, so may be not exactly precise. HPET - depending on hardware can work as per-CPU or global, supports periodic and one-shot operation, usually provides several event timers. i8254 - global, limited to periodic mode, because same hardware used also as time counter. RTC - global, supports only periodic mode, set of frequencies in Hz limited by powers of 2. Depending on hardware capabilities, drivers preferred in following orders, either LAPIC, HPETs, i8254, RTC or HPETs, LAPIC, i8254, RTC. User may explicitly specify wanted timers via loader tunables or sysctls: kern.eventtimer.timer1 and kern.eventtimer.timer2. If requested driver is unavailable or inoperative, system will try to replace it. If no more timers available or "NONE" specified for second, system will operate using only one timer, multiplying its frequency by few times and using respective dividers to honor hz, stathz and profhz values, set during initial setup.	2010-06-20 21:33:29 +00:00
Alexander Motin	5ec55931d6	Core i5, same as previously Core2Duo, found to not set P-state for single core lower than set on other cores. Do not try to test P-states on attach on SMP systems. It is hopeless now and will just pollute verbose logs. If needed, check still can be forced via loader tunable.	2010-06-19 13:09:42 +00:00
John Baldwin	61d3f0bab2	Restore the machine check register banks on resume. For banks being monitored via CMCI, reset the interrupt threshold to 1 on resume. Reviewed by: jkim MFC after: 2 weeks	2010-06-15 18:51:41 +00:00
Alexander Motin	93fc07b434	Virtualize pci_remap_msi_irq() call from general MSI code. It allows MSI (FSB interrupts) to be used by non-PCI devices, such as HPET.	2010-06-14 07:10:37 +00:00
John Baldwin	3aa6d94e0c	Update several places that iterate over CPUs to use CPU_FOREACH().	2010-06-11 18:46:34 +00:00
Alexander Motin	ae834fc9ba	Do not disable edge-triggered interrupts before migration. DELAY() with interrupt disabled highly probable causes interrupt loss.	2010-06-10 17:04:01 +00:00
John Baldwin	b9cd2f771a	Move the MD support for PCI message signalled interrupts to the x86 tree as it is identical for i386 and amd64.	2010-06-08 18:36:03 +00:00
John Baldwin	2465e30f0c	Move the machine check support code to the x86 tree since it is identical on i386 and amd64. Requested by: alc	2010-06-08 18:04:07 +00:00
John Baldwin	53a908cb07	Move the I/O APIC code to the x86 tree since it is identical on i386 and amd64.	2010-06-08 17:51:21 +00:00
John Baldwin	58ccad7ddc	Add support for corrected machine check interrupts. CMCI is a new local APIC interrupt that fires when a threshold of corrected machine check events is reached. CMCI also includes a count of events when reporting corrected errors in the bank's status register. Note that individual banks may or may not support CMCI. If they do, each bank includes its own threshold register that determines when the interrupt fires. Currently the code uses a very simple strategy where it doubles the threshold on each interrupt until it succeeds in throttling the interrupt to occur only once a minute (this interval can be tuned via sysctl). The threshold is also adjusted on each hourly poll which will lower the threshold once events stop occurring. Tested by: Sailaja Bangaru sbappana at yahoo com MFC after: 1 month	2010-05-24 15:45:05 +00:00
Alexander Motin	dbd55f3ff0	- Implement MI helper functions, dividing one or two timer interrupts with arbitrary frequencies into hardclock(), statclock() and profclock() calls. Same code with minor variations duplicated several times over the tree for different timer drivers and architectures. - Switch all x86 archs to new functions, simplifying the code and removing extra logic from timer drivers. Other archs are also welcome.	2010-05-24 11:40:49 +00:00
Alexander Motin	ad384d144a	Restore the different APIC init orders for i386 and amd64 that were unified in r208452. It seems neither order satisfies both archs, for different reasons. Submitted by: kib@	2010-05-24 01:49:00 +00:00
Alexander Motin	fa1ed4bd1a	Unify local_apic.c for x86 archs.	2010-05-23 17:45:01 +00:00
Rui Paulo	f79727118c	Fix another instance of lapic_cyclic_clock_func.	2010-04-20 21:04:57 +00:00
Attilio Rao	17586b1af8	Default the machdep.lapic_allclocks to be enabled in order to cope with broken atrtc. Now if you want more correct stats on profhz and stathz it may be disabled by setting to 0. Reported by: A. Akephalos <akephalos dot akephalos at gmail dot com>, Jakub Lach <jakub_lach at mailplus dot pl> MFC: 1 week	2010-04-09 14:22:09 +00:00
Attilio Rao	306c0c6ea0	Improving the clocks auto-tuning by firstly checking if the atrtc may be correctly initialized and just then assign to softclock/profclock. Right now, some atrtc seems reporting strange diagnostic error* making the current pattern bogus. In order to do that cleanly, lapic_setup_clock(), on both ia32 and amd64, now accepts as arguments the desired sources to handle, and returns the actual ones (LAPIC_CLOCK_NONE is forbidden because otherwise there is no meaning in calling such function). This allows to bring out into common x86 code the handling part for machdep.lapic_allclocks tunable, which is retained. Sponsored by: Sandvine Incorporated Tested by: yongari, Richard Todd <rmtodd at ichotolot dot servalan dot com> MFC: 3 weeks X-MFC: r202387, 204309	2010-03-03 17:13:29 +00:00
Attilio Rao	3258030144	Introduce the new kernel sub-tree x86 which should contain all the code shared and generalized between our current amd64, i386 and pc98. This is just an initial step that should lead to a more complete effort. For the moment, a very simple porting of cpufreq modules, BIOS calls and the whole MD specific ISA bus part is added to the sub-tree but ideally a lot of code might be added and more shared support should grow. Sponsored by: Sandvine Incorporated Reviewed by: emaste, kib, jhb, imp Discussed on: arch MFC: 3 weeks	2010-02-25 14:13:39 +00:00

330 Commits