Also disable a couple of ACPI devices that are not usable under Dom0.
To this end a couple of booleans are added that allow disabling ACPI
specific devices.
Sponsored by: Citrix Systems R&D
Reviewed by: jhb
x86/xen/xen_nexus.c:
- Return BUS_PROBE_SPECIFIC in the Xen Nexus attachement routine to
force the usage of the Xen Nexus.
- Attach the ACPI bus when running as Dom0.
dev/acpica/acpi_cpu.c:
dev/acpica/acpi_hpet.c:
dev/acpica/acpi_timer.c
- Add a variable that gates the addition of the devices.
x86/include/init.h:
- Declare variables that control the attachment of ACPI cpu, hpet and
timer devices.
Xen PVHVM guest.
Submitted by: Roger Pau Monné
Sponsored by: Citrix Systems R&D
Reviewed by: gibbs
Approved by: re (blanket Xen)
MFC after: 2 weeks
sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
- Make sure that are no MMU related IPIs pending on migration.
- Reset pending IPI_BITMAP on resume.
- Init vcpu_info on resume.
sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
sys/x86/acpica/acpi_wakeup.c:
sys/x86/x86/intr_machdep.c:
sys/x86/isa/atpic.c:
sys/x86/x86/io_apic.c:
sys/x86/x86/local_apic.c:
- Add a "suspend_cancelled" parameter to pic_resume(). For the
Xen PIC, restoration of interrupt services differs between
the aborted suspend and normal resume cases, so we must provide
this information.
sys/dev/acpica/acpi_timer.c:
sys/dev/xen/timer/timer.c:
sys/timetc.h:
- Don't swap out "suspend safe" timers across a suspend/resume
cycle. This includes the Xen PV and ACPI timers.
sys/dev/xen/control/control.c:
- Perform proper suspend/resume process for PVHVM:
- Suspend all APs before going into suspension, this allows us
to reset the vcpu_info on resume for each AP.
- Reset shared info page and callback on resume.
sys/dev/xen/timer/timer.c:
- Implement suspend/resume support for the PV timer. Since FreeBSD
doesn't perform a per-cpu resume of the timer, we need to call
smp_rendezvous in order to correctly resume the timer on each CPU.
sys/dev/xen/xenpci/xenpci.c:
- Don't reset the PCI interrupt on each suspend/resume.
sys/kern/subr_smp.c:
- When suspending a PVHVM domain make sure there are no MMU IPIs
in-flight, or we will get a lockup on resume due to the fact that
pending event channels are not carried over on migration.
- Implement a generic version of restart_cpus that can be used by
suspended and stopped cpus.
sys/x86/xen/hvm.c:
- Implement resume support for the hypercall page and shared info.
- Clear vcpu_info so it can be reset by APs when resuming from
suspension.
sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/x86/xen/xen_intr.c:
- Support UP kernel configurations.
sys/x86/xen/xen_intr.c:
- Properly rebind per-cpus VIRQs and IPIs on resume.
- Increase probing order for ECDT table to match HID-based probing.
- Decrease probing order for HPET table to match HID-based probing.
- Decrease probing order for CPUs and system resources.
- Fix ACPI_DEV_BASE_ORDER to reflect the reality.
quality to 950. HPET on modern platforms usually have better resolution and
lower latency than ACPI timer. Effectively this changes default timecounter
hardware from ACPI-fast to HPET by default when both are available.
Discussed with: avg
show that there are perfectly working PM timers with occasional "hiccups",
probably because of an SMI. Now we ignore the maximum if it happens once in
the test loop and the width is small enough. Also, relax normal width a bit
to count in a boundary case.
on the fact that real hardware has almost fixed cost to read the ACPI timer.
It is virtually always false for hardware emulation and it makes no sense to
read it multiple times, which is already quite expensive for full emulation.
the fast or safe/slow method is in use. Fast remains at 1000, slow is
now at 850 (always preferred to TSC). Since the HPET has proven slower
than ACPI-fast on some systems, drop its quality to 900. In the future,
it is hoped that HPET performance will improve as it is the main
timer Intel supports. HPET may move back to 2000 in -current once RELENG_7
is branched to ensure that it gets tested.
Approved by: re
sysctl_handle_int is not sizeof the int type you want to export.
The type must always be an int or an unsigned int.
Remove the instances where a sizeof(variable) is passed to stop
people accidently cut and pasting these examples.
In a few places this was sysctl_handle_int was being used on 64 bit
types, which would truncate the value to be exported. In these
cases use sysctl_handle_quad to export them and change the format
to Q so that sysctl(1) can still print them.
with acpi but the timer runs twice as fast. Note that the main problem
(system doesn't work properly with acpi disabled) should be fixed separately.
Changes:
* Add a quirk to disable the timer
* Merge the P5A and P5A-B quirks since they appear to be based on the
same ASL.
PR: i386/72450
Tested by: Kevin Oberman <oberman es.net>
MFC after: 3 days
added an arbitrary delay to our readings, causing us to use the ACPI-safe
read method when not necessary. Submitted by: bde
Old:
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks BAD min = 3, max = 19, width = 16
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks BAD min = 3, max = 19, width = 16
ACPI timer looks GOOD min = 3, max = 5, width = 2
ACPI timer looks GOOD min = 3, max = 4, width = 1
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
New:
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
ACPI timer looks GOOD min = 3, max = 4, width = 1
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
Also, reduce unnecesary overhead in ACPI-fast by remove the barrier for
reads. The timer in the ACPI-fast case is known to increase monotonically
so there is no need to serialize access to it.
workaround was for hardware where the clock was not latched, not for
hardware that was too slow. Also, make variable names more specific for ddb
printing.
what the ACPI-safe workaround is intended to fix. Requested by phk.
Set the bushandle and tag when attaching the timer, don't do it each time
in read_counter(). Pointed out by bde.
Move test_counter() to the end. Staticize acpi_timer_reg.
supported. Symptoms of this bug included unnecessary use of ACPI-safe
and a dmesg that has deltas of about 2^24:
ACPI timer looks BAD min = 2, max = 16777206, width = 16777204
ACPI timer looks BAD min = 2, max = 7, width = 5
ACPI timer looks GOOD min = 4, max = 5, width = 1
ACPI timer looks BAD min = 2, max = 16777206, width = 16777204
ACPI timer looks BAD min = 2, max = 7, width = 5
ACPI timer looks BAD min = 2, max = 16777210, width = 16777208
ACPI timer looks BAD min = 4, max = 16777189, width = 16777185
ACPI timer looks GOOD min = 4, max = 5, width = 1
ACPI timer looks BAD min = 2, max = 7, width = 5
ACPI timer looks BAD min = 4, max = 16777189, width = 16777185
To fix this:
* Use a 32 bit timecounter mask when the timer is 32 bits.
* In test_counter(), use the acpi_TimerDelta function which handles 24/32
bit timers and wraparound.
Miscellaneous fixes:
* Use C99 initializers for timecounter struct.
* Use u_int and uint32_t where appropriate instead of unsigned.
* Remove whitespace-only lines
* Remove the old PIIX4 PCI workaround. The timecounter testing code has
been in use for long enough to prove it's functional.
Sort acpi debug values. Change "disable" to "disabled" to match rest of
the kernel. Remove debugging from acpi_toshiba since it was only used for
probe/attach.
A timecounter will be selected when registered if its quality is
not negative and no less than the current timecounters.
Add a sysctl to report all available timecounters and their qualities.
Give the dummy timecounter a solid negative quality of minus a million.
Give the i8254 zero and the ACPI 1000.
The TSC gets 800, unless APM or SMP forces it negative.
Other timecounters default to zero quality and thereby retain current
selection behaviour.
at all (ie reads yield constant values). Display the width as the
difference between max and min so that constant timers have width
zero.
o Get the address of the timer from the XPmTmrBlk field instead of
the V1_PmTmrBlk field. The former is a generic address and can
specify a memory mapped I/O address. Remove <machine/bus_pio.h>
to account for this. The timer is now properly configured on
machines with ACPI v2 tables, whether PIO or MEMIO. Note that
the acpica code converts v1 tables into v2 tables so the address
is always present in XPmTmrBlk.
o Replace the TIMER_READ macro with a call to the read_counter()
function and add a barrier to make sure that we observe proper
ordering of the reads.
timecounter will be used starting at the next second, which is
good enough for sysctl purposes. If better adjustment is needed
the NTP PLL should be used.
environment needed at boot time to a dynamic subsystem when VM is
up. The dynamic kernel environment is protected by an sx lock.
This adds some new functions to manipulate the kernel environment :
freeenv(), setenv(), unsetenv() and testenv(). freeenv() has to be
called after every getenv() when you have finished using the string.
testenv() only tests if an environment variable is present, and
doesn't require a freeenv() call. setenv() and unsetenv() are self
explanatory.
The kenv(2) syscall exports these new functionalities to userland,
mainly for kenv(1).
Reviewed by: peter
The conclusion is that this method really can tell the perfect from the
less than perfect ACPI counters.
It is in fact probably a bit more discriminative than that, but we
will rather condemn some otherwise perfect counters to the slightly
slower "-safe" version, than certify a counter as perfect which
will let us down later.
Many thanks to all the people who sent email reports!
the inter-value histogram for 2000 samples. If the width is 3 or less
for 10 consequtive samples, we trust the counter to be good, otherwise
we use the *_safe() method.
This method may be too strict, but the worst which can happen is that
we take the performance hit of the *_safe() method when we should not.
Make the *_safe() method more discriminating by mandating that the three
samples do not span more than 15 ticks on the counter.
Disable the PCI-ident based probing as a means to recognize good
counters.
Inspiration from: dillon and msmith
latch the acpi timer, resulting in weird deltas. The problem is severe
enough to adversely effect the timecounter code.
Default to the 'safe' version of the get-timecount function. The probe
will override it if a known-good chipset is found. This is temporary
until a more complete solution is found.
Reviewed by: phk
- Remove the beer-ware license (reqested by phk)
- Reorganise so that the PIIX4 workaround code is kept together, and
switch the workaround function via the timecounter struct, saving
a compare in the read-timecounter codepath. Also indicate that
the workaround is active by changing the timecounter hardware string.