freebsd-skq

Author	SHA1	Message	Date
jhb	6d5201fff5	Fix a few bugs in the SRAT parsing code: - Actually increment ndomain when building our list of known domains so that we can properly renumber them to be 0-based and dense. - If the number of domains exceeds the configured maximum (VM_NDOMAIN), bail out of processing the SRAT and disable NUMA rather than hitting an obscure panic later. - Don't bother parsing the SRAT at all if VM_NDOMAIN is set to 1 to disable NUMA (the default). Reported by: phk (2) MFC after: 1 week	2012-01-03 20:53:58 +00:00
ed	8005c0f481	Get rid of kludgy per-descriptor state handling in acpi_apm. Where i386/bios/apm.c requires no per-descriptor state, the ACPI version of these device do. Instead of using hackish clone lists that leave stale device nodes lying around, use the cdevpriv API.	2011-12-05 16:08:18 +00:00
marius	17e14c6132	- There's no need to overwrite the default device method with the default one. Interestingly, these are actually the default for quite some time (bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9) since r52045) but even recently added device drivers do this unnecessarily. Discussed with: jhb, marcel - While at it, use DEVMETHOD_END. Discussed with: jhb - Also while at it, use __FBSDID.	2011-11-22 21:28:20 +00:00
ed	0c56cf839d	Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.	2011-11-07 15:43:11 +00:00
ed	e97eae1577	Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. This means that their use is restricted to a single C file.	2011-11-07 06:44:47 +00:00
jhb	1448b25b0d	Ignore SRAT memory entries if the memory range does not overlap with an existing phys_avail[] table. If a hw.physmem setting causes a memory domain to not be present in phys_avail[], the SRAT table will now be ignored rather than triggering a panic when a CPU in the missing domain tries to allocate a page. MFC after: 1 week	2011-10-05 16:03:47 +00:00
attilio	683d7a54ce	Fix a deficiency in the selinfo interface: If a selinfo object is recorded (via selrecord()) and then it is quickly destroyed, with the waiters missing the opportunity to awake, at the next iteration they will find the selinfo object destroyed, causing a PF#. That happens because the selinfo interface has no way to drain the waiters before to destroy the registered selinfo object. Also this race is quite rare to get in practice, because it would require a selrecord(), a poll request by another thread and a quick destruction of the selrecord()'ed selinfo object. Fix this by adding the seldrain() routine which should be called before to destroy the selinfo objects (in order to avoid such case), and fix the present cases where it might have already been called. Sometimes, the context is safe enough to prevent this type of race, like it happens in device drivers which installs selinfo objects on poll callbacks. There, the destruction of the selinfo object happens at driver detach time, when all the filedescriptors should be already closed, thus there cannot be a race. For this case, mfi(4) device driver can be set as an example, as it implements a full correct logic for preventing this from happening. Sponsored by: Sandvine Incorporated Reported by: rstone Tested by: pluknet Reviewed by: jhb, kib Approved by: re (bz) MFC after: 3 weeks	2011-08-25 15:51:54 +00:00
silby	ee4d73add4	Disable TSC usage inside SMP VM environments. On my VMware ESXi 4.1 environment with a core i5-2500K, operation in this mode causes timeouts from the mpt driver. Switching to the ACPI-fast timer resolves this issue. Switching the VM back to single CPU mode also works, which is why I have not disabled the TSC in that mode. I did not test with KVM or other VM environments, but I am being cautious and assuming that the TSC is not reliable in SMP mode there as well. Reviewed by: kib Approved by: re (kib) MFC after: Not applicable, the timecounter code is new for 9.x	2011-08-22 03:10:29 +00:00
jhb	0710dab3a6	Fix build when NEW_PCIB is not defined. Submitted by: gcooper (partially) Pointy hat to: jhb	2011-07-16 14:05:34 +00:00
jhb	b75d5a0ef9	Respect the BIOS/firmware's notion of acceptable address ranges for PCI resource allocation on x86 platforms: - Add a new helper API that Host-PCI bridge drivers can use to restrict resource allocation requests to a set of address ranges for different resource types. - For the ACPI Host-PCI bridge driver, use Producer address range resources in _CRS to enumerate valid address ranges for a given Host-PCI bridge. This can be disabled by including "hostres" in the debug.acpi.disabled tunable. - For the MPTable Host-PCI bridge driver, use entries in the extended MPTable to determine the valid address ranges for a given Host-PCI bridge. This required adding code to parse extended table entries. Similar to the new PCI-PCI bridge driver, these changes are only enabled if the NEW_PCIB kernel option is enabled (which is enabled by default on amd64 and i386). Approved by: re (kib)	2011-07-15 21:08:58 +00:00
jkim	96d6cc9832	If TSC stops ticking in C3, disable deep sleep when the user forcefully select TSC as timecounter hardware. Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de)	2011-07-14 21:00:26 +00:00
jhb	83fca1d193	Move {amd64,i386}/pci/pci_bus.c and {amd64,i386}/include/pci_cfgreg.h to the x86 tree. The $PIR code is still only enabled on i386 and not amd64. While here, make the qpi(4) driver on conditional on 'device pci'.	2011-06-22 21:04:13 +00:00
jkim	6da60ac39e	Set negative quality to TSC timecounter when C3 state is enabled for Intel processors unless the invariant TSC bit of CPUID is set. Intel processors may stop incrementing TSC when DPSLP# pin is asserted, according to Intel processor manuals, i. e., TSC timecounter is useless if the processor can enter deep sleep state (C3/C4). This problem was accidentally uncovered by r222869, which increased timecounter quality of P-state invariant TSC, e.g., for Core2 Duo T5870 (Family 6, Model f) and Atom N270 (Family 6, Model 1c). Reported by: Fabian Keil (freebsd-listen at fabiankeil dot de) Ian FREISLICH (ianf at clue dot co dot za) Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de) - Core2 Duo T5870 (C3 state available/enabled) jkim - Xeon X5150 (C3 state unavailable)	2011-06-22 16:40:45 +00:00
jkim	8a9fdbb838	Teach the compiler how to shift TSC value efficiently. As noted in r220631, some times compiler inserts redundant instructions to preserve unused upper 32 bits even when it is casted to a 32-bit value. Unfortunately, it seems the problem becomes more serious when it is shifted, especially on amd64.	2011-06-17 21:41:06 +00:00
jkim	9c58536b52	Tidy up r222866. - Re-add accidentally removed atomic op. for sysctl(9) handler. - Remove a period(`.') at the end of a debugging message. - Consistently spell "low" for "TSC-low" timecounter throughout. Pointed out by: bde	2011-06-08 23:44:59 +00:00
jkim	9f1a70eb73	Increase quality of TSC (or TSC-low) timecounter to 1000 if it is P-state invariant. For SMP case (TSC-low), it also has to pass SMP synchronization test and the CPU vendor/model has to be white-listed explicitly. Currently, all Intel CPUs and single-socket AMD Family 15h processors are listed here. Discussed with: hackers	2011-06-08 20:08:06 +00:00
jkim	faed140c1e	Introduce low-resolution TSC timecounter "TSC-low". It replaces the normal TSC timecounter if TSC frequency is higher than ~4.29 MHz (or 2^32-1 Hz) or multiple CPUs are present. The "TSC-low" frequency is always lower than a preset maximum value and derived from TSC frequency (by being halved until it becomes lower than the maximum). Note the maximum value for SMP case is significantly lower than UP case because we want to reduce (rare but known) "temporal anomalies" caused by non-serialized RDTSC instruction. Normally, it is still higher than "ACPI-fast" timecounter frequency (which was default timecounter hardware for long time until r222222) to be useful.	2011-06-08 19:38:31 +00:00
jkim	16bd333059	Remove a redundant assignment since r221703.	2011-06-08 18:52:42 +00:00
attilio	d7cb9e4814	MFC	2011-05-09 18:53:13 +00:00
attilio	a0b51ba62f	MFC	2011-05-06 22:45:33 +00:00
attilio	fe4de567b5	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architecture. cpumask_t on the other side is implemented as an array of longs, and easilly extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, needs to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primirally done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to came soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
attilio	b29cc3952a	MFC	2011-05-03 18:57:46 +00:00
jkim	44b010200b	Fix build with clang. Please note there is an LLVM/Clang PR: http://llvm.org/bugs/show_bug.cgi?id=9379 Reported by: rpaulo, dim	2011-05-02 17:08:36 +00:00
jhb	3e97a80649	Add implementations of BUS_ADJUST_RESOURCE() to the PCI bus driver, generic PCI-PCI bridge driver, x86 nexus driver, and x86 Host to PCI bridge drivers.	2011-05-02 14:13:12 +00:00
jhb	08955ceac0	Change rman_manage_region() to actually honor the rm_start and rm_end constraints on the rman and reject attempts to manage a region that is out of range. - Fix various places that set rm_end incorrectly (to ~0 or ~0u instead of ~0ul). - To preserve existing behavior, change rman_init() to set rm_start and rm_end to allow managing the full range (0 to ~0ul) if they are not set by the caller when rman_init() is called.	2011-04-29 18:41:21 +00:00
jkim	3b56923bc2	Detect VMware guest and set the TSC frequency as reported by the hypervisor. VMware products virtualize TSC and it run at fixed frequency in so-called "apparent time". Although virtualized i8254 also runs in apparent time, TSC calibration always gives slightly off frequency because of the complicated timer emulation and lost-tick correction mechanism.	2011-04-29 18:20:12 +00:00
jkim	6bf8d645b9	Turn off periodic recalibration of CPU ticker frequency if it is invariant.	2011-04-28 17:56:02 +00:00
attilio	d685681d59	Add the watchdogs patting during the (shutdown time) disk syncing and disk dumping. With the option SW_WATCHDOG on, these operations are doomed to let watchdog fire, fi they take too long. I implemented the stubs this way because I really want wdog_kern_* KPI to not be dependant by SW_WATCHDOG being on (and really, the option only enables watchdog activation in hardclock) and also avoid to call them when not necessary (avoiding not-volountary watchdog activations). Sponsored by: Sandvine Incorporated Discussed with: emaste, des MFC after: 2 weeks	2011-04-28 16:02:05 +00:00
jkim	8e677ed825	Use ACPI-supplied CPU frequencies instead of estimated ones as we are about to use other values from the same table anyway. MFC after: 3 days	2011-04-27 00:32:35 +00:00
jkim	adbae7d2c9	Use newly added rdtsc32() for DELAY(9) as well.	2011-04-14 19:11:45 +00:00
jkim	38dc7c42e6	Work around an emulator problem where virtual CPU advertises TSC is P-state invariant and APERF/MPERF MSRs exist but these MSRs never tick. When we calculate effective frequency from cpu_est_clockrate(), it caused panic of division-by-zero. Now we test whether these MSRs actually increase to avoid such foot-shooting. Reported by: dim Tested by: dim	2011-04-14 17:50:26 +00:00
jkim	c55a9d790a	Use newly added rdtsc32() for the timecounter_get_t method.	2011-04-14 17:08:23 +00:00
jkim	5b73ac45d1	Add some tunable descriptions about x86 timers. Requested by: arundel	2011-04-14 00:07:08 +00:00
jkim	2092a06579	Do not use TSC for DELAY(9) if it not P-state invariant to avoid possible foot-shooting. DELAY() becomes unreliable when TSC frequency varies wildly, especially cpufreq(4) and powerd(8) are used at the same time.	2011-04-12 22:41:52 +00:00
jkim	8eb15cd79a	Probe capability to find effective frequency. When the TSC is P-state invariant, APERF/MPERF ratio can be used to find effective frequency.	2011-04-12 22:15:46 +00:00
jkim	76647eca4d	Add a new tunable 'machdep.disable_tsc_calibration' to allow skipping TSC frequency calibration. For Intel processors, if brand string from CPUID contains its nominal frequency, this frequency is used instead.	2011-04-12 21:08:34 +00:00
jkim	61582b7c03	Merge two similar functions to reduce duplication.	2011-04-11 19:27:44 +00:00
jkim	096c7a804f	Refactor DELAYDEBUG as it is only useful for correcting i8254 frequency.	2011-04-08 19:54:29 +00:00
jkim	95c723445e	Use atomic load & store for TSC frequency. It may be overkill for amd64 but safer for i386 because it can be easily over 4 GHz now. More worse, it can be easily changed by user with 'machdep.tsc_freq' tunable (directly) or cpufreq(4) (indirectly). Note it is intentionally not used in performance critical paths to avoid performance regression (but we should, in theory). Alternatively, we may add "virtual TSC" with lower frequency if maximum frequency overflows 32 bits (and ignore possible incoherency as we do now).	2011-04-07 23:28:28 +00:00
jkim	9c28b443cb	Revert r219676. Requested by: jhb, bde	2011-03-16 16:44:08 +00:00
jkim	2c6f3a8cd1	Do not let machdep.tsc_freq modify tsc_freq itself. It is bad for i386 as it does not operate atomically. Actually, it serves no purpose. Noticed by: bde	2011-03-15 19:47:20 +00:00
jkim	ad8ef5e4c7	Deprecate tsc_present as the last of its real consumers finally disappeared.	2011-03-15 17:19:52 +00:00
jkim	36e15e1609	When TSC is unavailable, broken or disabled and the current timecounter has better quality than i8254 timer, use it for DELAY(9).	2011-03-14 22:05:59 +00:00
jkim	7df55dcdeb	Add a tunable "machdep.disable_tsc" to turn off TSC. Specifically, it turns off boot-time CPU frequency calibration, DELAY(9) with TSC, and using TSC as a CPU ticker. Note tsc_present does not change by this tunable.	2011-03-11 00:44:32 +00:00
jkim	a52b39f6a4	Turn off pointless P-state invariant TSC detection based on CPU model on a virtual machine.	2011-03-10 23:06:13 +00:00
jkim	98d68ca741	Deprecate rarely used tsc_is_broken. Instead, we zero out tsc_freq because it is almost always used with tsc_freq any way.	2011-03-10 20:02:58 +00:00
jkim	a0eded8271	Set C1 "I/O then Halt" capability bit for Intel EIST. Some broken BIOSes refuse to load external SSDTs if this bit is unset for _PDC. It seems Linux and OpenSolaris did the same long ago. MFC after: 1 week	2011-02-25 23:14:24 +00:00
brucec	4a353c54fd	Fix typos - remove duplicate "is". PR: docs/154934 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-23 09:22:33 +00:00
jhb	5cec5b65a5	Use a dedicated taskqueue with a thread that runs at a software-interrupt priority for the periodic polling of the machine check registers.	2011-02-03 13:09:22 +00:00
mdf	6b5f615b7c	Introduce signed and unsigned version of CTLTYPE_QUAD, renaming existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().	2011-01-19 23:00:25 +00:00

1 2 3

130 Commits