are still bound to BSP. It confuses timer management logic in per-CPU mode
and may cause timer not being reloaded. Check such cases on interrupt
arival and reload timer to give system some more time to manage proper
binding.
- Add special check for case when time expires before being programmed.
This fixes interrupt loss and respectively timer death on attempt to
program very short interval. Increase minimal supported period to more
realistic value.
- Add support for hint.hpet.X.allowed_irqs tunable, allowing manually
specify which interrupts driver allowed to use. Unluckily, many BIOSes
program wrong allowed interrupts mask, so driver tries to stay on safe
side by not using unshareable ISA IRQs. This option gives control over
this limitation, allowing more per-CPU timers to be provided, when FSB
interrupts are not supported. Value of this tunable is bitmask.
- Do not use regular interrupts on virtual machines. QEMU and VirtualBox
do not support them properly, that may cause problems. Stay safe by default.
Same time both QEMU and VirtualBox work fine in legacy_route mode.
VirtualBox also works fine if manually specify allowed ISA IRQs with above.
method is used by the PCI bus driver to query the power management system
to determine the proper device state to be used for a device during suspend
and resume. For the ACPI PCI bridge drivers this calls
acpi_device_pwr_for_sleep(). This removes ACPI-specific knowledge from
the PCI and PCI-PCI bridge drivers.
Reviewed by: jkim
bus_generic_resume() since the pci_link(4) driver was added.
- Change the ACPI PCI-PCI bridge driver to inherit most of its methods
from the generic PCI-PCI bridge driver. In particular, this will now
restore PCI config registers for ACPI PCI-PCI bridges.
Tested by: Oleg Sharoyko osharoiko of gmail
lengths. Make MI wrapper code to validate periods in request. Make kernel
clock management code to honor these hardware limitations while choosing hz,
stathz and profhz values.
AcpiOsMapMemory()/AcpiOsUnmapMemory() (-> pmap_mapbios()/pmap_unmapbios())
for AcpiOsReadMemory() and AcpiOsWriteMemory(). Although they do not sound
too obvious, these functions are exclusively used to access memory mapped
IO in ACPICA.
According to ACPICA User Guide and Programmer Reference, the read data must
be zero extended to fill the 64-bit return value even if the bit width of
the location is less than 64.
- Return error when 64-bit access is requested as we do not support 64-bit
PCI register access (yet). XXX We may have to split it up into two 32-bit
accesses if it is really required.
According to ACPICA User Guide and Programmer Reference, the read data must
be zero extended to fill the 32-bit return value even if the bit width of
the port is less than 32.
- Remove 64-bit read/write from AcpiOsReadMemory() and AcpiOsWriteMemory().
These functions do not support 64-bit access (yet). Clean up style nits
and unnecessary bit masking while I am here.
Reported by: Liu, Jinsong (jinsong dot liu at intel dot com) via
Lin Ming (ming dot m dot lin at intel dot com) [1]
HPET to steal IRQ0 from i8254 and IRQ8 from RTC timers. It can be suitable
for HPETs without FSB interrupts support, as it gives them two unshared
IRQs. It allows them to provide one per-CPU event timer on dual-CPU system,
that should be suitable for further tickless kernels.
To enable it, such lines may be added to /boot/loader.conf:
hint.atrtc.0.clock=0
hint.attimer.0.clock=0
hint.hpet.0.legacy_route=1
writing event timer drivers, for choosing best possible drivers by machine
independent code and for operating them to supply kernel with hardclock(),
statclock() and profclock() events in unified fashion on various hardware.
Infrastructure provides support for both per-CPU (independent for every CPU
core) and global timers in periodic and one-shot modes. MI management code
at this moment uses only periodic mode, but one-shot mode use planned for
later, as part of tickless kernel project.
For this moment infrastructure used on i386 and amd64 architectures. Other
archs are welcome to follow, while their current operation should not be
affected.
This patch updates existing drivers (i8254, RTC and LAPIC) for the new
order, and adds event timers support into the HPET driver. These drivers
have different capabilities:
LAPIC - per-CPU timer, supports periodic and one-shot operation, may
freeze in C3 state, calibrated on first use, so may be not exactly precise.
HPET - depending on hardware can work as per-CPU or global, supports
periodic and one-shot operation, usually provides several event timers.
i8254 - global, limited to periodic mode, because same hardware used also
as time counter.
RTC - global, supports only periodic mode, set of frequencies in Hz
limited by powers of 2.
Depending on hardware capabilities, drivers preferred in following orders,
either LAPIC, HPETs, i8254, RTC or HPETs, LAPIC, i8254, RTC.
User may explicitly specify wanted timers via loader tunables or sysctls:
kern.eventtimer.timer1 and kern.eventtimer.timer2.
If requested driver is unavailable or unoperational, system will try to
replace it. If no more timers available or "NONE" specified for second,
system will operate using only one timer, multiplying it's frequency by few
times and uing respective dividers to honor hz, stathz and profhz values,
set during initial setup.
measured interval as upper bound. It should be more precise then just
assuming hz/2. For idle CPU it should be quite precise, for busy - not
worse then before.
state lower than the lowest one supported by the current CPU. This closes
some races with changes to the hw.acpi.cpu_cx_lowest sysctl while Cx
states for individual CPUs were changing (e.g. unplugging the AC adapter
of a laptop) that could result in panics.
Submitted by: Giovanni Trematerra
Tested by: David Demelier demelier dot david of gmail
MFC after: 3 days
via %s
Most of the cases looked harmless, but this is done for the sake of
correctness. In one case it even allowed to drop an intermediate buffer.
Found by: clang
MFC after: 2 week
device, make sure we have no real HPET device entry with same ID.
As side effect, it potentially allows several HPETs to be attached.
Use first of them for timecounting, rest (if ever present) could later
be used as event sources.
Setting the new sysctl MIB "debug.acpi.enable_debug_objects" to a non-zero
value enables us to print Debug object when something is written to it.
- Allow users to disable interpreter slack mode. Setting the new tunable
"debug.acpi.interpreter_slack" to zero disables some workarounds for common
BIOS mistakes and enables strict ACPI implementations by the specification.
It is belived that that pass s not needed anymore.
Specifically it is not required now for the reasons that were given
in the removed comment.
Discussed with: jhb
MFC after: 4 weeks
Some current systems dynamically load SSDT(s) when _PDC/_OSC method
of Processor is evaluated. Other devices in ACPI namespace may access
objects defined in the dynamic SSDT. Drivers for such devices might
have to have a rather high priority, because of other dependencies.
Good example is acpi_ec driver for EC.
Thus we attach to Processors as early as possible to load the SSDTs
before any other drivers may try to evaluate control methods.
It also seems to be a natural order for a processor in a device
hierarchy.
On the other hand, some child devices on acpi cpu bus need to access
other system resources like PCI configuration space of chipset devices,
so they need to be probed and attached rather late.
For this reason we probe and attach the cpu bus at
SI_SUB_CONFIGURE:SI_ORDER_MIDDLE SYSINIT level.
In the future this could be done more elegantly via multipass.
Please note that acpi drivers that might access ACPI namespace from
device_identify will do that before _PDC/_OSC of Processors are evaluated.
Legacy cpu driver is not affected by this change.
PR: kern/142561 (in part)
Reviewed by: jhb
Silence from: acpi@
MFC after: 5 weeks
_PDC was deprecated in favor of _OSC long time ago, but it
seems that they still peacefully coexist and in some case
only _PDC is present.
Still _OSC provides a reacher interface and is capable to
report back its status.
If the status is non-zero, then report it, we may find
it useful to understand what firmware expects from OS.
Also clean up some comments that became less useful over time.
Reviewed by: njl, jhb, rpaulo
MFC after: 3 weeks
Also, account for a quirk of AMD/ATI HPET which reports number of timers
instead of id of the last timer as manadated by the specification.
Currently this has no effect on functionality but in the future we may
make actual use of the HPET timers, not only of its timecounter.
MFC after: 2 weeks
This is not only a prudent thing to do, but also makes sure that probe
method is not confused by non-NULL 'private', if the previous attach
attempt fails for any reason.
PR: kern/142561
Tested by: Alex Goncharov <alex-goncharov@comcast.net>
MFC after: 4 days
o acpi_hpet: auto-added 'wildcard' devices can be identified by
non-NULL handle attribute.
o acpi_ec: auto-add 'wildcard' devices can be identified by
unset (NULL) private attribute.
o acpi_cpu: use private instead of magic to store cpu id.
Reviewed by: jhb
Silence from: acpi@
MFC after: 2 weeks
X-MFC-Note: perhaps the ivar should stay for ABI stability
sysctl lock. The 'video' lock now protects the 'bus' of video output
devices attached to a graphics adapter. It is used when iterating over
the list of outputs, etc. The 'video_output' lock is used to lock the
output-specific data similar to a driver lock for the individual video
outputs.
MFC after: 2 weeks
startup and genericize it so it can be reused to map other tables as well:
- Add a routine to walk a list of ACPI subtables such as those used in the
APIC and SRAT tables in the MI acpi(4) driver.
- Move the routines for mapping and unmapping an ACPI table as well as
mapping the RSDT or XSDT and searching for a table with a given signature
out into acpica_machdep.c for both amd64 and i386.
BIOS-enumerated devices:
- Assume a device is a match if the memory and I/O ports match even if the
IRQ or DRQ is wrong or missing. Some BIOSes don't include an IRQ for
the atrtc device for example.
- Add a hack to better match floppy controller devices. Many BIOSes do not
include the starting port of the floppy controller listed in the hints
(0x3f0) in the resources for the device. So far, however, all the BIOS
variations encountered do include the 'port + 2' resource (0x3f2), so
adjust the matching for "fdc" devices to look for 'port + 2'.
Reviewed by: imp
MFC after: 3 days
The newbus lock is responsible for protecting newbus internIal structures,
device states and devclass flags. It is necessary to hold it when all
such datas are accessed. For the other operations, softc locking should
ensure enough protection to avoid races.
Newbus lock is automatically held when virtual operations on the device
and bus are invoked when loading the driver or when the suspend/resume
take place. For other 'spourious' operations trying to access/modify
the newbus topology, newbus lock needs to be automatically acquired and
dropped.
For the moment Giant is also acquired in some key point (modules subsystem)
in order to avoid problems before the 8.0 release as module handlers could
make assumptions about it. This Giant locking should go just after
the release happens.
Please keep in mind that the public interface can be expanded in order
to provide more support, if there are really necessities at some point
and also some bugs could arise as long as the patch needs a bit of
further testing.
Bump __FreeBSD_version in order to reflect the newbus lock introduction.
Reviewed by: ed, hps, jhb, imp, mav, scottl
No answer by: ariff, thompsa, yongari
Tested by: pho,
G. Trematerra <giovanni dot trematerra at gmail dot com>,
Brandon Gooch <jamesbrandongooch at gmail dot com>
Sponsored by: Yahoo! Incorporated
Approved by: re (ksmith)
- Preallocate some memory for ACPI tasks early enough. We cannot use
malloc(9) any more because spin mutex may be held here. The reserved
memory can be tuned via debug.acpi.max_tasks tunable or ACPI_MAX_TASKS
in kernel configuration. The default is 32 tasks.
- Implement a custom taskqueue_fast to wrap the new memory allocation.
This implementation is not the fastest in the world but we are being
conservative here.
a _BBN value of 0 if it was for the first bridge encountered since some
older systems returned _BBN of 0 for all bridges. However, some newer
systems enumerate bridges with non-zero _BBN before bus 0 which is
perfectly valid. Handle both cases by trusting the first bridge that has
a _BBN of 0 and falling back to reading from non-standard config registers
only for subsequent bridges with a _BBN of 0. We also only perform this
check for segment (domain) 0. We assume that _BBN is always correct
for segments other than 0.
Tested by: Josef Moellers josef.moellers at fujitsu
MFC after: 1 week
leading to a bug, when C-state does not decrease on sleep shorter then
declared transition latency. Fixing this deprecates workaround for broken
C-states on some hardware.
By the way, change state selecting logic a bit. Instead of last sleep
time use short-time average of it. Global interrupts rate in system is a
quite random value, to corellate subsequent sleeps so directly.
- Probe supported sleep states from acpi_attach() just once and do not
call AcpiGetSleepTypeData() again. It is redundant because
AcpiEnterSleepStatePrep() does it any way.
- Treat UNKNOWN sleep state as NONE, i.e., "do nothing", and remove obscure
NONE state (ACPI_S_STATES_MAX + 1) to avoid confusions.
- Do not set unsupported sleep states as default button/switch events.
If the default sleep state is not supported, just set it as UNKNOWN/NONE.
- Do not allow sleep state change if the system is not fully up and running.
This should prevent entering S5 state multiple times, which causes strange
behaviours later.
- Make sleep states case-insensitive when they are used with sysctl(8).
For example,
sysctl hw.acpi.lid_switch_state=s1
sysctl hw.acpi.sleep_button_state=none
are now legal and equivalent to the uppercase ones.
This change adds (possibly redundant) early check for invalid
state input parameter (including S0). Handling of S5 request
is reduced to simply calling shutdown_nice(). As a result
control flow of acpi_EnterSleepState is somewhat simplified
and resume/backout half of the function is not executed
for S5 (soft poweroff) request and invalid state requests.
Note: it seems that shutdown_nice may act as nop when initproc
is already initialized (to grab pid of 1), but init process is in
"pre-natal" state.
Tested by: Fabian Keil <fk@fabiankeil.de>
Reviewed by: njl, jkim
Approved by: rpaulo
into acpi_cpu_startup() which is where all the other code to update this
global variable lives. This fixes a bug where cpu_cx_count was not updated
correctly if acpi_cpu_generic_cx_probe() returned early.
PR: kern/108581
Debugged by: Bruce Cran
Reviewed by: avg, njl, sepotvin
MFC after: 3 days
This code is heavily inspired by Takanori Watanabe's experimental SMP patch
for i386 and large portion was shamelessly cut and pasted from Peter Wemm's
AP boot code.
This is triggered only if BIOS configures ACPI_BITREG_BUS_MASTER_RLD
aka BRLD_EN_BM to 1.
Rationale:
1. we do not support C3 on PIIX4E
2. bus master activity need not break out of C2 state
3. because of CPU_QUIRK_NO_BM_CTRL quirk we may reset bus master
status which would result in immediate break out from C2
So if you have seen
cpu0: too many short sleeps, backing off to C1
with this chipset before you may want to try cx_lowest of C2 again.
Reviewed by: rpaulo (mentor), njl
Approved by: rpaulo (mentor)
if (batt_sleep_ms)
AcpiOsSleep(1);
where the rest are all:
if (batt_sleep_ms)
AcpiOsSleep(batt_sleep_ms);
I can't recall why that one was different, so change it
to match the rest.
Pointed out by: Christoph Mallon
MFC after: 2 weeks
On some laptops with smart batteries, enabling battery monitoring
software causes keystrokes from atkbd to be lost. This has also been
reported on Linux, and is apparently due to the keyboard and I2C line
for the battery being routed through the same chip. Whether that's
accurate or not, adding extra sleeps to the status checking code
causes the problem to go away.
I've been running this for nearly six months now on my laptop,
it works like a charm.
Reviewed by: Nate Lawson (in a previous revision)
MFC after: 2 weeks
- An "at" hint now reserves a device name.
- A new BUS_HINT_DEVICE_UNIT method is added to the bus interface. When
determining the unit number of a device, this method is invoked to
let the bus driver specify the unit of a device given a specific
devclass. This is the only way a device can be given a name reserved
via an "at" hint.
- Implement BUS_HINT_DEVICE_UNIT() for the acpi(4) and isa(4) bus drivers.
Both of these busses implement this by comparing the resources for a
given hint device with the resources enumerated by ACPI/PnPBIOS and
wire a unit if the hint resources are a subset of the "real" resources.
- Use bus_hinted_children() for adding hinted devices on isa(4) busses
now instead of doing it by hand.
- Remove the unit kludging from sio(4) as it is no longer necessary.
Prodding from: peter, imp
OK'd by: marcel
MFC after: 1 month
use process ID as ACPI thread ID. Concurrent requests with equal thread
IDs broke ACPI mutexes operation causing unpredictable errors including
AE_AML_MUTEX_NOT_ACQUIRED that I have seen.
Use kernel thread ID instead of process ID for ACPI thread.
- Rename pciereg_cfgopen() to pcie_cfgregopen() and expose it to the
rest of the kernel. It now also accepts parameters via function
arguments rather than global variables.
- Add a notion of minimum and maximum bus numbers and reject requests for
an out of range bus.
- Add more range checks on slot/func/reg/bytes parameters to the cfg reg
read/write routines. Don't panic on any invalid parameters, just fail
the request (writes do nothing, reads return -1). This matches the
behavior of the other cfg mechanisms.
- Port the memory mapped configuration space access to amd64. On amd64
we simply use the direct map (via pmap_mapdev()) for the memory mapped
window.
- During acpi_attach() just after loading the ACPI tables, check for a
MCFG table. If it exists, call pciereg_cfgopen() on each subtable
(memory mapped window). For now we only support windows for domain 0
that start with bus 0. This removes the need for more chipset-specific
quirks in the MD code.
- Remove the chipset-specific quirks for the Intel 5000P/V/Z chipsets
since these machines should all have MCFG tables via ACPI.
- Updated pci_cfgregopen() to DTRT if ACPI had invoked pcie_cfgregopen()
earlier.
MFC after: 2 weeks
behavior. Specifically, probe Host-PCI bridges in the order they are
encountered in the tree. For CPUs, just use an order of 100000 and assume
that no Host-PCI bridges will be more than 10000 levels deep in the
namespace. This fixes an issue on some boxes where the HPET timer stopped
attaching.
assumptions about the state of the cooling devices. Instead, switch them
off on init and, only after that, we are in TZ_ACTIVE_NONE.
Submited by: Andriy Gapon <avg at icyb.net.ua>
Reviewed by: njl
different "platforms" on x86 machines. The existing code already handles
having two platforms: ACPI and legacy. However, the existing approach was
rather hardcoded and difficult to extend. These changes take the approach
that each x86 hardware platform should provide its own nexus(4) driver (it
can inherit most of its behavior from the default legacy nexus(4) driver)
which is responsible for probing for the platform and performing
appropriate platform-specific setup during attach (such as adding a
platform-specific bus device). This does mean changing the x86 platform
busses to no longer use an identify routine for probing, but to move that
logic into their matching nexus(4) driver instead.
- Make the default nexus(4) driver in nexus.c on i386 and amd64 handle the
legacy platform. It's probe routine now returns BUS_PROBE_GENERIC so it
can be overriden.
- Expose a nexus_init_resources() routine which initializes the various
resource managers so that subclassed nexus(4) drivers can invoke it from
their attach routine.
- The legacy nexus(4) driver explicitly adds a legacy0 device in its
attach routine.
- The ACPI driver no longer contains an new-bus identify method. Instead
it exposes a public function (acpi_identify()) which is a probe routine
that the MD nexus(4) drivers can use to probe for ACPI. All of the
probe logic in acpi_probe() is now moved into acpi_identify() and
acpi_probe() is just a stub.
- On i386 and amd64, an ACPI-specific nexus(4) driver checks for ACPI via
acpi_identify() and claims the nexus0 device if the probe succeeds. It
then explicitly adds an acpi0 device in its attach routine.
- The legacy(4) driver no longer knows anything about the acpi0 device.
- On ia64 if acpi_identify() fails you basically end up with no devices.
This matches the previous behavior where the old acpi_identify() would
fail to add an acpi0 device again leaving you with no devices.
Discussed with: imp
Silence on: arch@
the cpufreq drivers to reliably use properties of PCI devices for quirks,
etc.
- For the legacy drivers, add CPU devices via an identify routine in the
CPU driver itself rather than in the legacy driver's attach routine.
- Add CPU devices after Host-PCI bridges in the acpi bus driver.
- Change the ichss(4) driver to use pci_find_bsf() to locate the ICH and
check its device ID rather than having a bogus PCI attachment that only
checked for the ID in probe and always failed. As a side effect, you
can now kldload ichss after boot.
- Fix the ichss(4) driver to use the correct device_t for the ICH (and not
for ichss0) when doing PCI config space operations to enable SpeedStep.
MFC after: 2 weeks
Reviewed by: njl, Andriy Gapon avg of icyb.net.ua
the appropriate bit in the DEVACTB register.
This change allows the C2 state on those systems to work as expected.
Reviewed by: njl
Submitted by: Andriy Gapon <avg at icyb.net.ua>
MFC after: 1 week
spec:
- Use read/modify/write cycles to enable and disable the HPET instead of
writing 0 to reserved bits.
- Shutdown the HPET during suspend as encouraged by the spec.
- Fail to attach to an HPET with a period of zero.
MFC after: 1 week
PR: kern/119675 [3]
Reported by: Leo Bicknell | bicknell ufp.org
zone code. The GPE handler method (i.e. _L00) generates various Notify
events that need to be run to completion before the GPE is re-enabled.
In ACPI-CA, we queue an asynch callback at the same priority as a Notify
so that it will only run after all Notify handlers have completed. The
callback re-enables the GPE afterwards. We also changed the priority of
Notifies to be the same as GPEs, given the possibility that another GPE
could arrive before the Notifies have completed and we don't want it to
get queued ahead of the rest.
The ACPI-CA change was submitted by Alexey Starikovskiy (SUSE) and will
appear in a later release. Special thanks to him for helping track this
bug down.
MFC after: 1 week
Tested by: jhb, Yousif Hassan <yousif / alumni.jmu.edu>
correct number of acpi_thermalX devices. Having this wrong caused the
acpi_thermal thread to realloc the array of devices on each loop iteration.
MFC after: 1 week
PR: kern/118497
Submitted by: Pasi Parviainen
for that argument. This will allow DDB to detect the broad category of
reason why the debugger has been entered, which it can use for the
purposes of deciding which DDB script to run.
Assign approximate why values to all current consumers of the
kdb_enter() interface.
CPUs to make sure idle threads are evicted from the softc before returning
from acpi_cpu_shutdown(). However, this is unnecessary since stop_cpus()
handles this for itself and at this point it's possible that our IPI will be
blocked (interrupts disabled).
Thanks to: Glen Leeder <glen.leeder / nokia.com>
MFC after: 3 days
reason (not all BIOSen have _DIS methods for all link devices for example).
This matches the behavior of attach() with respect to _DIS as well.
Submitted by: njl
handle to the PCI device_t if the ACPI device_t is already attached to a
driver. This happens on the Tablet TC1000 which for some reason includes
two PCI-ISA bridges and treats the second bridge as an ACPI system resource
device.
Reviewed by: njl (a while ago)
MFC after: 3 days
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.
I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
new kthread_xxx() calls.
voltage of 0. This can result in a divide by zero trap. Add a guard
for this case. The value of lfcap is checked in acpi_battery_bif_valid()
just before this, so it is safe.
Reportd by: sam
Approved by: re
MFC after: 3 days
of directly from acpi0. Before it would attach prior to the sysresource
devices, causing the later allocation of its memory range to fail and
print a warning like "acpi0: reservation of fed00000, 1000 (3) failed".
Use an explicit define for our probe order base value of 10.
Help from: jhb
Tested by: Abdullah Ibn Hamad Al-Marri <almarrie / gmail.com>
MFC after: 3 days
Approved by: re
support machines having multiple independently numbered PCI domains
and don't support reenumeration without ambiguity amongst the
devices as seen by the OS and represented by PCI location strings.
This includes introducing a function pci_find_dbsf(9) which works
like pci_find_bsf(9) but additionally takes a domain number argument
and limiting pci_find_bsf(9) to only search devices in domain 0 (the
only domain in single-domain systems). Bge(4) and ofw_pcibus(4) are
changed to use pci_find_dbsf(9) instead of pci_find_bsf(9) in order
to no longer report false positives when searching for siblings and
dupe devices in the same domain respectively.
Along with this change the sole host-PCI bridge driver converted to
actually make use of PCI domain support is uninorth(4), the others
continue to use domain 0 only for now and need to be converted as
appropriate later on.
Note that this means that the format of the location strings as used
by pciconf(8) has been changed and that consumers of <sys/pciio.h>
potentially need to be recompiled.
Suggested by: jhb
Reviewed by: grehan, jhb, marcel
Approved by: re (kensmith), jhb (PCI maintainer hat)
polling/interrupt-driven fallback and instead use polling only during
boot and pure interrupt-driven mode after boot. Polled mode could be
relegated completely to a legacy role if we could enable interrupts
during boot. Polled mode can be forced after boot by setting
debug.acpi.ec.polled="1", i.e. if there are timeouts.
- Use polling only during boot, shutdown, or if requested by the user.
Otherwise, use a generation count of GPEs, incremented atomically. This
prevents an old status value from being used if the EC is really slow
and the same condition (i.e. multiple IBEs for a write transaction) is
being checked.
- Check for and run the query handler directly if the SCI bit is set in
the status register during boot. Previously, the query handler wouldn't
run until interrupts were finally enabled late in boot.
- During boot and after starting a command, check if the event appears
to already have occurred before we even start waiting. If so, it's
possible the EC is very slow and we might accept an old status value.
Print a warning in this case. Once we've booted, interrupt-driven mode
should work just fine but polled mode could be unreliable. There's not
much more we can do about this until interrupts are enabled during boot.
- In the above case, we also do one final check if the interrupt-driven
mode gets a timeout. If the status is complete, it will force the
system back into polled mode since interrupt mode doesn't work. For
polled mode during boot, if the status appears to be already complete
before beginning the check loop, it waits 10 us before actually checking
the status, just in case the EC is really slow and hasn't gotten to work
on the new request yet.
- Use upper-case hex for the _Qxx method
- Use device_printf for errors, don't hide them under verbose
- Increase default total timeout to 750 ms and decrease polling interval
to 5 us.
- Don't pass the status value via the softc. Just read it directly.
- Remove the mutex. We use the sx lock for transaction serialization
with the query handler.
- Remove the Intel copyright notice as no code of theirs was ever
present in this file (verified against rev 1.1)
- Allow KTR module-only builds for ease of testing
Thanks to jkim and Alexey Starikovskiy for helpful discussions and testing.
Approved by: re
MFC after: 2 weeks
needed at least to convince the BIOS to give us access to CPU freq
control on MacBooks.
Submitted by: Rui Paulo <rpaulo / fnop.net>
Approved by: re
MFC after: 5 days
the fast or safe/slow method is in use. Fast remains at 1000, slow is
now at 850 (always preferred to TSC). Since the HPET has proven slower
than ACPI-fast on some systems, drop its quality to 900. In the future,
it is hoped that HPET performance will improve as it is the main
timer Intel supports. HPET may move back to 2000 in -current once RELENG_7
is branched to ensure that it gets tested.
Approved by: re
advancing. Read from the timer before attaching to be sure it advances
in 1 us. Since the slowest rate allowed by the spec is 10 MHz, the
timer is guaranteed to change in this interval if it is working.
Tested by: Rui Paulo
Approved by: re
MFC after: 3 days
switch (i.e. lid) is set to have an action of NONE. This is not an
invalid state, so silently return. This fixes the warning:
"acpi: request to enter state S6 failed (err 22)"
Approved by: re
Improvements:
* /etc/rc.suspend,rc.resume are always run, no matter the source of the
suspend request (user or kernel, apm or acpi)
* suspend now requires positive user acknowledgement. If a user program
wants to cancel the suspend, they can. If one of the user programs
hangs or doesn't respond within 10 seconds, the system suspends anyway.
* /dev/apm is clonable, allowing multiple listeners for suspend events.
In the future, xorg-server can use this to be informed about suspend
even if there are other listeners (i.e. apmd).
Changes:
* Two new ACPI ioctls: REQSLPSTATE and ACKSLPSTATE. Request begins the
process of suspending by notifying all listeners. acpi is monitored by
devd(8) and /dev/apm listener(s) are also counted. Users register their
approval or disapproval via Ack. If anyone disapproves, suspend is vetoed.
* Old user programs or kernel modules that used SETSLPSTATE continue to
work. A message is printed once that this interface is deprecated.
* acpiconf gains the -k flag to ack the suspend request. This flag is
undocumented on purpose since it's only used by /etc/rc.suspend. It is
not intended to be a permanent change and will be removed once a better
power API is implemented.
* S5 (power off) is no longer supported via acpiconf -s 5 or apm -z/-Z.
This restores previous behavior of halt/shutdown -p being the interface.
* Miscellaneous improvements to error reporting
Approved by: re
sysctl_handle_int is not sizeof the int type you want to export.
The type must always be an int or an unsigned int.
Remove the instances where a sizeof(variable) is passed to stop
people accidently cut and pasting these examples.
In a few places this was sysctl_handle_int was being used on 64 bit
types, which would truncate the value to be exported. In these
cases use sysctl_handle_quad to export them and change the format
to Q so that sysctl(1) can still print them.
While in the suspend path, this means the idle thread will just return
immediately rather than trying to enter C1-n. This helps in the case where
the chipset is powered down before the rest of the system and reads from
the cpu sleep registers begin returning immediately, causing the logic that
catches bad C2/C3 behavior to kick in. Observed on my Panasonic Y4.
MFC after: 3 days
(j/i) was being used and it was being incremented, not decremented as before.
Factor out this code into a common function and call it from both the common
and per-CPU case.
MFC after: 1 day
The global lock is a memory region shared with the BIOS and thus
has some strange behavior like the fact that the sleep is 1 ms max.
We use standard mutexes to synchronize with the SCI so acquiring
the global lock after locking the mutex resulted in a witness
warning.
To deal with this for now, acquire the global lock before all other
locks, similar to Giant. This should fix the witness "sleeping
with mutex held" issue on boot that occurred after the last ACPI-CA
import. In the future, we hope to move to the new mutex interface
in ACPI-CA instead of the pseudo-semaphore version we have now.
Reviewed by: jkim
back in a simulated resume instead of entering the requested suspend state.
This helps in testing drivers separately from the acpi suspend code. To
test your drivers, set debug.acpi.suspend_bounce=1 and then run
acpiconf -s3 (or 4).
MFC after: 1 day
- Simplify the amount of work that has be done for each architecture by
pushing more of the truly MI code down into the PCI bus driver.
- Don't bind MSI-X indicies to IRQs so that we can allow a driver to map
multiple MSI-X messages into a single IRQ when handling a message
shortage.
The changes include:
- Add a new pcib_if method: PCIB_MAP_MSI() which is called by the PCI bus
to calculate the address and data values for a given MSI/MSI-X IRQ.
The x86 nexus drivers map this into a call to a new 'msi_map()' function
in msi.c that does the mapping.
- Retire the pcib_if method PCIB_REMAP_MSIX() and remove the 'index'
parameter from PCIB_ALLOC_MSIX(). MD code no longer has any knowledge
of the MSI-X index for a given MSI-X IRQ.
- The PCI bus driver now stores more MSI-X state in a child's ivars.
Specifically, it now stores an array of IRQs (called "message vectors" in
the code) that have associated address and data values, and a small
virtual version of the MSI-X table that specifies the message vector
that a given MSI-X table entry uses. Sparse mappings are permitted in
the virtual table.
- The PCI bus driver now configures the MSI and MSI-X address/data
registers directly via custom bus_setup_intr() and bus_teardown_intr()
methods. pci_setup_intr() invokes PCIB_MAP_MSI() to determine the
address and data values for a given message as needed. The MD code
no longer has to call back down into the PCI bus code to set these
values from the nexus' bus_setup_intr() handler.
- The PCI bus code provides a callout (pci_remap_msi_irq()) that the MD
code can call to force the PCI bus to re-invoke PCIB_MAP_MSI() to get
new values of the address and data fields for a given IRQ. The x86
MSI code uses this when an MSI IRQ is moved to a different CPU, requiring
a new value of the 'address' field.
- The x86 MSI psuedo-driver loses a lot of code, and in fact the separate
MSI/MSI-X pseudo-PICs are collapsed down into a single MSI PIC driver
since the only remaining diff between the two is a substring in a
bootverbose printf.
- The PCI bus driver will now restore MSI-X state (including programming
entries in the MSI-X table) on device resume.
- The interface for pci_remap_msix() has changed. Instead of accepting
indices for the allocated vectors, it accepts a mini-virtual table
(with a new length parameter). This table is an array of u_ints, where
each value specifies which allocated message vector to use for the
corresponding MSI-X message. A vector of 0 forces a message to not
have an associated IRQ. The device may choose to only use some of the
IRQs assigned, in which case the unused IRQs must be at the "end" and
will be released back to the system. This allows a driver to use the
same remap table for different shortage values. For example, if a driver
wants 4 messages, it can use the same remap table (which only uses the
first two messages) for the cases when it only gets 2 or 3 messages and
in the latter case the PCI bus will release the 3rd IRQ back to the
system.
MFC after: 1 month
specific request and thus should first try to be allocated from the
sys_resource pool. This avoids using the sys_resource pool for wildcard
requests that have bounded ranges coming from cbb(4) and Host-PCI pcib(4)
drivers.
Tested by: Andrea Bittau <a.bittau of cs.ucl.ac.uk fame>
Sleuthing by: Andrea Bittau as well
obtaining and releasing shared and exclusive locks. The algorithms for
manipulating the lock cookie are very similar to that rwlocks. This patch
also adds support for exclusive locks using the same algorithm as mutexes.
A new sx_init_flags() function has been added so that optional flags can be
specified to alter a given locks behavior. The flags include SX_DUPOK,
SX_NOWITNESS, SX_NOPROFILE, and SX_QUITE which are all identical in nature
to the similar flags for mutexes.
Adaptive spinning on select locks may be enabled by enabling the
ADAPTIVE_SX kernel option. Only locks initialized with the SX_ADAPTIVESPIN
flag via sx_init_flags() will adaptively spin.
The common cases for sx_slock(), sx_sunlock(), sx_xlock(), and sx_xunlock()
are now performed inline in non-debug kernels. As a result, <sys/sx.h> now
requires <sys/lock.h> to be included prior to <sys/sx.h>.
The new kernel option SX_NOINLINE can be used to disable the aforementioned
inlining in non-debug kernels.
The size of struct sx has changed, so the kernel ABI is probably greatly
disturbed.
MFC after: 1 month
Submitted by: attilio
Tested by: kris, pjd
one (hardware & global lock). This should address witness complaints that
a duplicate mutex is being acquired. Be sure to free the mutex to fix a
potential memory leak.
MFC after: 3 days
simpler. It now can just use rman_is_region_manager() during
acpi_release_resource() to see if the the resource is suballocated from
a system resource. Also, the driver no longer needs MD knowledge about
how to setup bus space tags and handles when doing a suballocation, but
can simply rely on bus_activate_resource() in the parent setting all that
up.
cause the EC to stop handling future events because the GPE stayed masked.
Set a flag when queueing a GPE handler since it will ultimately re-enable
the GPE. In all other cases, re-enable it ourselves. I reworked the
patch from the submitter.
Submitted by: Rong-en Fan <grafan@gmail.com>
most systems, it causes the EC not to respond for some Acer and Compaq/HP
laptops. This is the default value for Linux also. For systems that need
it, burst mode can be enabled via the tunable/sysctl:
debug.acpi.ec.burst="1"
acpi module. Also clean up print of args a little.
This was accidentally committed as 1.9.2.3 in the stable branch. Since it
is harmless, I will let the "insta-MFC" stand unless there is a problem.
EC occasionally times out and provides bogus values (3000C). This change
prevents those systems from prematurely shutting down while we work on the
underlying problem. Also, bump the sanity value to 0...200C from 0...150C.
case where it asynchronously exits burst mode on its own. Handle different
values of hz in sleep loop. Provide more debugging options to tune EC
behavior. These tunables/sysctls may be temporary and are not for user
access if the EC is working properly. Burst mode is now on by default for
testing and the poll interval has been increased from 100 to 500 us and
total timeout from 100 to 500 ms.
Hopefully this should be the first step of addressing reports of timeout
errors during battery or thermal access, especially on HP/Compaq laptops.
It is reasonably stable and should not cause a loss of functionality or
performance on systems that were previously working. Testing shows an
increase of responsiveness by ~75% on one system.
PR: kern/98171
triggers a KASSERT) or local variables. In the case of kern_ndis, the
tsleep() actually used a common sleep address (curproc) making it
susceptible to a premature wakeup.
- First off, device drivers really do need to know if they are allocating
MSI or MSI-X messages. MSI requires allocating powerof2() messages for
example where MSI-X does not. To address this, split out the MSI-X
support from pci_msi_count() and pci_alloc_msi() into new driver-visible
functions pci_msix_count() and pci_alloc_msix(). As a result,
pci_msi_count() now just returns a count of the max supported MSI
messages for the device, and pci_alloc_msi() only tries to allocate MSI
messages. To get a count of the max supported MSI-X messages, use
pci_msix_count(). To allocate MSI-X messages, use pci_alloc_msix().
pci_release_msi() still handles both MSI and MSI-X messages, however.
As a result of this change, drivers using the existing API will only
use MSI messages and will no longer try to use MSI-X messages.
- Because MSI-X allows for each message to have its own data and address
values (and thus does not require all of the messages to have their
MD vectors allocated as a group), some devices allow for "sparse" use
of MSI-X message slots. For example, if a device supports 8 messages
but the OS is only able to allocate 2 messages, the device may make the
best use of 2 IRQs if it enables the messages at slots 1 and 4 rather
than default of using the first N slots (or indicies) at 1 and 2. To
support this, add a new pci_remap_msix() function that a driver may call
after a successful pci_alloc_msix() (but before allocating any of the
SYS_RES_IRQ resources) to allow the allocated IRQ resources to be
assigned to different message indices. For example, from the earlier
example, after pci_alloc_msix() returned a value of 2, the driver would
call pci_remap_msix() passing in array of integers { 1, 4 } as the
new message indices to use. The rid's for the SYS_RES_IRQ resources
will always match the message indices. Thus, after the call to
pci_remap_msix() the driver would be able to access the first message
in slot 1 at SYS_RES_IRQ rid 1, and the second message at slot 4 at
SYS_RES_IRQ rid 4. Note that the message slots/indices are 1-based
rather than 0-based so that they will always correspond to the rid
values (SYS_RES_IRQ rid 0 is reserved for the legacy INTx interrupt).
To support this API, a new PCIB_REMAP_MSIX() method was added to the
pcib interface to change the message index for a single IRQ.
Tested by: scottl
modern dual-core systems as well.
- Parse the _CST packages for each cpu and track all the states individually,
on a per-cpu basis.
- Revert to generic FADT/P_BLK based Cx control if the _CST package
is not present on all cpus. In that case, the new driver will
still support per-cpu Cx state handling. The driver will determine the
highest Cx level that can be supported by all the cpus and configure the
available Cx state based on that.
- Fixed the case where multiple cpus in the system share the same
registers for Cx state handling. To do that, added a new flag
parameter to the acpi_PkgGas and acpi_bus_alloc_gas functions that
enable the caller to add the RF_SHAREABLE flag. This flag could also be
useful to other callers (acpi_throttle?) in the tree but this change is
not yet made.
- For Core Duo cpus, both cores seems to be taken out of C3 state when
any one of the cores need to transition out. This broke the short sleep
detection logic. It is disabled now if there is more than one cpu in
the system for now as it fixed it in my case. This quirk may need to
be re-enabled later differently.
- Added support to control cx_lowest on a per-cpu basis. There is still
a generic cx_lowest to enable changing cx_lowest for all cpus with a single
sysctl and for ease of use. Sample output for the new sysctl:
dev.cpu.0.cx_supported: C1/1 C2/1 C3/57
dev.cpu.0.cx_lowest: C3
dev.cpu.0.cx_usage: 0.00% 43.16% 56.83%
dev.cpu.1.cx_supported: C1/1 C2/1 C3/57
dev.cpu.1.cx_lowest: C3
dev.cpu.1.cx_usage: 0.00% 45.65% 54.34%
hw.acpi.cpu.cx_lowest: C3
This work was done by Stephane E. Potvin with some simple reworking by
myself. Thank you.
Submitted by: Stephane E. Potvin <sepotvin / videotron.ca>
MFC after: 2 weeks
return an error since it returns a count of battery devices in the system.
Set it to 0 explicitly, since it is the only switch branch that doesn't set
it.
# I guess no one uses it.
pcib_alloc_msix() methods instead of using the method from the generic
PCI-PCI bridge driver as the PCI-PCI methods will be gaining some PCI-PCI
specific logic soon.
- Add 3 new functions to the pci_if interface along with suitable wrappers
to provide the device driver visible API:
- pci_alloc_msi(dev, int *count) backed by PCI_ALLOC_MSI(). '*count'
here is an in and out parameter. The driver stores the desired number
of messages in '*count' before calling the function. On success,
'*count' holds the number of messages allocated to the device. Also on
success, the driver can access the messages as SYS_RES_IRQ resources
starting at rid 1. Note that the legacy INTx interrupt resource will
not be available when using MSI. Note that this function will allocate
either MSI or MSI-X messages depending on the devices capabilities and
the 'hw.pci.enable_msix' and 'hw.pci.enable_msi' tunables. Also note
that the driver should activate the memory resource that holds the
MSI-X table and pending bit array (PBA) before calling this function
if the device supports MSI-X.
- pci_release_msi(dev) backed by PCI_RELEASE_MSI(). This function
releases the messages allocated for this device. All of the
SYS_RES_IRQ resources need to be released for this function to succeed.
- pci_msi_count(dev) backed by PCI_MSI_COUNT(). This function returns
the maximum number of MSI or MSI-X messages supported by this device.
MSI-X is preferred if present, but this function will honor the
'hw.pci.enable_msix' and 'hw.pci.enable_msi' tunables. This function
should return the largest value that pci_alloc_msi() can return
(assuming the MD code is able to allocate sufficient backing resources
for all of the messages).
- Add default implementations for these 3 methods to the pci_driver generic
PCI bus driver. (The various other PCI bus drivers such as for ACPI and
OFW will inherit these default implementations.) This default
implementation depends on 4 new pcib_if methods that bubble up through
the PCI bridges to the MD code to allocate IRQ values and perform any
needed MD setup code needed:
- PCIB_ALLOC_MSI() attempts to allocate a group of MSI messages.
- PCIB_RELEASE_MSI() releases a group of MSI messages.
- PCIB_ALLOC_MSIX() attempts to allocate a single MSI-X message.
- PCIB_RELEASE_MSIX() releases a single MSI-X message.
- Add default implementations for these 4 methods that just pass the
request up to the parent bus's parent bridge driver and use the
default implementation in the various MI PCI bridge drivers.
- Add MI functions for use by MD code when managing MSI and MSI-X
interrupts:
- pci_enable_msi(dev, address, data) programs the MSI capability address
and data registers for a group of MSI messages
- pci_enable_msix(dev, index, address, data) initializes a single MSI-X
message in the MSI-X table
- pci_mask_msix(dev, index) masks a single MSI-X message
- pci_unmask_msix(dev, index) unmasks a single MSI-X message
- pci_pending_msix(dev, index) returns true if the specified MSI-X
message is currently pending
- Save the MSI capability address and data registers in the pci_cfgreg
block in a PCI devices ivars and restore the values when a device is
resumed. Note that the MSI-X table is not currently restored during
resume.
- Add constants for MSI-X register offsets and fields.
- Record interesting data about any MSI-X capability blocks we come
across in the pci_cfgreg block in the ivars for PCI devices.
Tested on: em (i386, MSI), bce (amd64/i386, MSI), mpt (amd64, MSI-X)
Reviewed by: scottl, grehan, jfv
MFC after: 2 months
WB (write-back) on x86 via control bits in PTEs and PDEs (including making
use of the PAT MSR). Changes include:
- A new pmap_mapdev_attr() function for amd64 and i386 which takes an
additional parameter (relative to pmap_mapdev()) specifying the cache
mode for this mapping. Note that on amd64 only WB mappings are done with
the direct map, all other modes result in a private mapping.
- pmap_mapdev() on i386 and amd64 now defaults to using UC (uncached)
mappings rather than WB. Previously we relied on the BIOS setting up
MTRR's to enforce memio regions being treated as UC. This might make
hw.cbb_start_memory unnecessary in some cases now for example.
- A new pmap_mapbios()/pmap_unmapbios() API has been added to allow places
that used pmap_mapdev() to map non-device memory (such as ACPI tables)
to do so using WB as before.
- A new pmap_change_attr() function for amd64 and i386 that changes the
caching mode for a range of KVA.
Reviewed by: alc
function independently. This change is not only load-tested since I don't
have hardware that supports acpi_dock. Clean up comments and a name a
few constants.
that aren't listed as valid in the link device's set of possible IRQs.
This allows the hints to be used to work around broken BIOSes that don't
specify the correct ste of possible IRQs. A warning is issued in the
dmesg in this case to be consistent with the $PIR handling code.
MFC after: 1 week
perform the reboot action via the reset register instead of our legacy
method. Default is 0 (use legacy). This is needed because some systems
hang on reboot even though they claim to support the reset register.
MFC after: 2 days
Prevent casual modification by requiring hw.acpi.thermal.user_override to
be set first. Fix printing of negative temperatures in the K->C conversion.
Document the remaining thermal sysctls.
MFC after: 3 days
systems. Introduce a new sysctl "hw.acpi.disable_on_reboot" that allows
users to re-enable the old behavior in case it's needed for some systems.
We never disable in the power-off path.
Original approach submitted by Alexander Logvinov <abuse@akavia.ru> with
reworking by Jung-uk Kim and myself.
Move the code for printing timer statistics into a test function instead of
an ifdef (accessible via the debug.acpi.hpet_test tunable). Also use defines
for register offsets instead of magic values.
Courtesy of: slow flight to HK
If the embedded controller exists before the sysresource devices, for
example, it will be attached first. Instead, let the normal device
order function work as we first desired. [1]
There still remained a problem where we couldn't allocate resources in
acpi0 that were passed up by the sysresource pseudo-devices. These
devices had to probe/attach first to give their resources to acpi, then
acpi would allocate them before probing/attaching other devices. To
work around this, we attach them from acpi_sysres_alloc(). A better
approach would be to implement multi-pass probe/attach in newbus but
that's a much bigger task.
Suggested by: jhb [1]
Hardware from: Centaur Technologies
MFC after: 1 week