1253 Commits

Author SHA1 Message Date
jhb
4a26c9bbdf Add a pcib_attach_child() method to manage adding the child "pci" device.
This allows the PCI-PCI bridge driver to save a reference to the child
device in its softc.

Note that this required moving the "pci" device creation out of
acpi_pcib_attach().  Instead, acpi_pcib_attach() is renamed to
acpi_pcib_fetch_prt() as its sole action now is to fetch the PCI
interrupt routing table.

Differential Revision:	https://reviews.freebsd.org/D6021
2016-04-27 16:39:05 +00:00
jhb
da15c11c31 Optionally return the output capabilities list from _OSC.
Both of the callers were expecting the input cap_set to be modified.
This fixes them to request cap_set to be updated with the returned buffer.

Reviewed by:	jkim
Differential Revision:	https://reviews.freebsd.org/D6040
2016-04-22 17:51:19 +00:00
jhb
c514278fc0 Queue the CPU-probing task after all acpi_cpu devices are attached.
Eventually with earlier AP startup this code will change to call the
startup function synchronously instead of queueing the task.  Moving
the time we queue the task should be a no-op since taskqueue threads
don't start executing tasks until much later, but this reduces the diff
with the earlier AP startup patches.

Sponsored by:	Netflix
2016-04-21 18:27:05 +00:00
jkim
7cc967f0e0 Prefer sizeof(*pointer) over sizeof(type). No functional change. 2016-04-20 21:30:56 +00:00
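A minimal sketch of the sizeof(*pointer) idiom named in this commit. The structure and function names are hypothetical and are not taken from the ACPI code; the point is only that the allocation size follows the pointer's type automatically.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical structure, used only for illustration. */
struct example_softc {
	int	unit;
	long	flags;
};

/*
 * Allocate a zeroed structure.  Because the size is derived from the
 * pointer, the call stays correct if the type of 'sc' ever changes.
 */
static struct example_softc *
example_alloc(void)
{
	struct example_softc *sc;

	/* Both spellings give the same size today; only one tracks the pointer. */
	assert(sizeof(*sc) == sizeof(struct example_softc));
	sc = calloc(1, sizeof(*sc));
	return (sc);
}
```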
jkim
1bcda70984 There is no need to use array any more. No functional change. 2016-04-20 21:26:59 +00:00
jkim
c6fadd2115 Remove query flag from acpi_EvaluateOSC(). This function does not support
return buffer (yet).
2016-04-20 21:21:47 +00:00
jhb
db2ee74814 Invoke _OSC on Host-PCI bridges.
Tell the firmware that we support PCI-express config space access
and MSI.

Reviewed by:	jkim
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D6023
2016-04-20 20:58:30 +00:00
jhb
11a5d8cc21 Add a wrapper for evaluating _OSC methods.
This wrapper does not translate errors in the first word to ACPI
error status returns.  Use this wrapper in the acpi_cpu(4) driver in
place of the existing _OSC code.  While here, fix a bug where the wrong
count of words was passed when invoking _OSC.

Reviewed by:	jkim
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D6022
2016-04-20 20:55:58 +00:00
jhb
784a797eed Add a new PCI bus interface method to alloc the ivars (dinfo) for a device.
The ACPI and OFW PCI bus drivers as well as CardBus override this to
allocate the larger ivars to hold additional info beyond the stock PCI ivars.

This removes the need to pass the size to functions like pci_add_iov_child()
and pci_read_device() simplifying IOV and bus rescanning implementations.

As a result of this and earlier changes, the ACPI PCI bus driver no longer
needs its own device_attach and pci_create_iov_child methods but can use
the methods in the stock PCI bus driver instead.

Differential Revision:	https://reviews.freebsd.org/D5891
2016-04-15 03:42:12 +00:00
jhb
6beb82443a Add more fine-grained kernel options for NUMA support.
VM_NUMA_ALLOC is used to enable use of domain-aware memory allocation in
the virtual memory system.  DEVICE_NUMA is used to enable affinity
reporting for devices such as bus_get_domain().

MAXMEMDOM must still be set to a value greater than 1 for any NUMA support
to be effective.  Note that 'cpuset -gd' always works if MAXMEMDOM is
enabled and the system supports NUMA.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D5782
2016-04-09 13:58:04 +00:00
jhb
0722712486 Associate device_t objects with ACPI handles via PCI_CHILD_ADDED().
Previously, the ACPI PCI bus driver did a single pass over the devices in
the namespace that were a child of a given PCI bus to associate the
PCI bus-enumerated device_t devices with the corresponding ACPI handles.
However, this meant that handles were only established at runtime for devices
found during the initial PCI bus scan.

PCI_IOV adds devices that show up after the initial PCI bus scan, and coming
changes to add a bus rescan can also add devices after the initial scan.

This change adds a pci_child_added() callback to the ACPI PCI bus that walks
the namespace to find the ACPI handle for each device that is added.  Using
a callback means that the handle is correctly set for any device no matter
how it is added (initial scan, IOV, or a bus rescan).
2016-04-07 17:15:16 +00:00
jhb
01f4e87387 Convert pci_delete_child() to a bus_child_deleted() method.
Instead of providing a wrapper around device_delete_child() that the PCI
bus and child bus drivers must call explicitly, move the bulk of the logic
from pci_delete_child() into a bus_child_deleted() method
(pci_child_deleted()).  This allows PCI devices to be safely deleted via
device_delete_child().
- Add a bus_child_deleted method to the ACPI PCI bus which clears the
  device_t associated with the corresponding ACPI handle in addition to
  the normal PCI bus cleanup.
- Change cardbus_detach_card to call device_delete_children() and move
  CardBus-specific delete logic into a new cardbus_child_deleted() method.
- Use device_delete_child() instead of pci_delete_child() in the SRIOV code.
- Add a bus_child_deleted method to the OpenFirmware PCI bus drivers which
  frees the OpenFirmware device info for each PCI device.

Reviewed by:	imp
Tested on:	amd64 (CardBus and PCI-e hotplug)
Differential Revision:	https://reviews.freebsd.org/D5831
2016-04-06 04:10:22 +00:00
jhibbits
720f47c9ed Use uintmax_t (typedef'd to rman_res_t type) for rman ranges.
On some architectures, u_long isn't large enough for resource definitions.
Particularly, powerpc and arm allow 36-bit (or larger) physical addresses, but
type `long' is only 32-bit.  This extends rman's resources to uintmax_t.  With
this change, any resource can feasibly be placed anywhere in physical memory
(within the constraints of the driver).

Why uintmax_t and not something machine dependent, or uint64_t?  Though it's
possible for uintmax_t to grow, it's highly unlikely it will become 128-bit on
32-bit architectures.  64-bit architectures should have plenty of RAM to absorb
the increase on resource sizes if and when this occurs, and the number of
resources on memory-constrained systems should be sufficiently small as to not
pose a drastic overhead.  That being said, uintmax_t was chosen for source
clarity.  If it's specified as uint64_t, all printf()-like calls would either
need casts to uintmax_t, or be littered with PRI*64 macros.  Casts to uintmax_t
aren't horrible, but it would also bake into the API for
resource_list_print_type() either a hidden assumption that entries get cast to
uintmax_t for printing, or these calls would need the PRI*64 macros.  Since
source code is meant to be read more often than written, I chose the clearest
path of simply using uintmax_t.

Tested on a PowerPC p5020-based board, which places all device resources in
0xfxxxxxxxx, and has 8GB RAM.
Regression tested on qemu-system-i386
Regression tested on qemu-system-mips (malta profile)

Tested PAE and devinfo on virtualbox (live CD)

Special thanks to bz for his testing on ARM.

Reviewed By: bz, jhb (previous)
Relnotes:	Yes
Sponsored by:	Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D4544
2016-03-18 01:28:41 +00:00
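The reasoning in this commit can be sketched in userland C. The names res_t, format_range() and truncates_in_32bit() are hypothetical stand-ins and are not the rman API; the sketch only shows that a 36-bit address does not survive a 32-bit integer, and that uintmax_t values print with %jx without casts or PRI*64 macros.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for rman_res_t as described above; hypothetical name. */
typedef uintmax_t res_t;

/* Format a resource range; %jx takes uintmax_t directly, with no cast. */
static int
format_range(char *buf, size_t len, res_t start, res_t end)
{
	assert(start <= end);
	return (snprintf(buf, len, "%#jx-%#jx", start, end));
}

/* Return 1 if 'addr' would be truncated when stored in a 32-bit integer. */
static int
truncates_in_32bit(res_t addr)
{
	return ((res_t)(uint32_t)addr != addr);
}
```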
jhibbits
ac542b0dba Remove default initializations for rman, a'la r296331 2016-03-04 01:25:45 +00:00
jkim
dcabc402a8 Silence PVS-Studio warning (V595). 2016-02-23 23:09:45 +00:00
jkim
627a0c8130 Silence PVS-Studio warning (V595). 2016-02-23 22:55:44 +00:00
jkim
17b02dde28 Remove brightness notify handler before reinstalling new one. 2016-02-23 22:50:45 +00:00
jkim
0a369764c2 Fix white spaces. 2016-02-23 22:30:45 +00:00
jkim
cfe709a284 Fix style(9) bugs. 2016-02-23 22:22:15 +00:00
kib
9b01734b01 Some BIOSes' ACPI bytecode needs to take (sleepable) acpi mutex for
acpi_GetInteger() execution.  Intel DMAR interrupt remapping code
needs to know UID of the HPET to properly route the FSB interrupts
from the HPET, even when interrupt remapping is disabled, and the code
is executed under some non-sleepable mutexes.

Cache HPET UIDs in the device softc at the attach time and provide
lock-less method to get UID, use the method from the dmar hpet
handling code instead of calling GetInteger().

Reported and tested by:	Larry Rosenman <ler@lerctr.org>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-02-20 13:37:04 +00:00
kib
efb12984b6 Switch /dev/hpet to use make_dev_s(9). Device needs si_drv1
initialized, do it correctly even though hpet cannot be loaded as a
module.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-02-20 13:21:59 +00:00
jhibbits
f8385663ee Introduce a RMAN_IS_DEFAULT_RANGE() macro, and use it.
This simplifies checking for default resource range for bus_alloc_resource(),
and improves readability.

This is part of, and related to, the migration of rman_res_t from u_long to
uintmax_t.

Discussed with:	jhb
Suggested by:	marcel
2016-02-20 01:32:58 +00:00
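A sketch of what such a default-range check could look like, assuming the usual convention that a request for "anywhere" is expressed as start 0 and end all-ones. IS_DEFAULT_RANGE, res_t and resolve_range() are hypothetical names for illustration and are not the definitions from sys/rman.h.

```c
#include <assert.h>
#include <stdint.h>

typedef uintmax_t res_t;	/* stand-in for rman_res_t; hypothetical name */

/*
 * Hypothetical equivalent of the macro described above: the default
 * range is assumed to be start 0 and end all-ones.
 */
#define	IS_DEFAULT_RANGE(s, e)	((s) == 0 && (e) == ~(res_t)0)

/* Pick the range to allocate from: the caller's, or a bus-provided default. */
static void
resolve_range(res_t *start, res_t *end, res_t def_start, res_t def_end)
{
	assert(def_start <= def_end);
	if (IS_DEFAULT_RANGE(*start, *end)) {
		*start = def_start;
		*end = def_end;
	}
}
```

Spelling the check as one macro keeps the all-ones constant tied to the resource type, which matters once that type is wider than u_long.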
adrian
f0291bb6ee document some ACPI related sysctls.
Submitted by:	Oliver Pinter <oliver.pinter@hardenedbsd.org>
Sponsored by:	HardenedBSD
Differential Revision:	https://reviews.freebsd.org/D5263
2016-02-19 05:02:17 +00:00
jkim
a44f626a58 Remove a bogus bzero() call.
Found by:	PVS-Studio
2016-02-18 23:32:11 +00:00
jhibbits
31bb8ee5bd Convert rman to use rman_res_t instead of u_long
Summary:
Migrate to using the semi-opaque type rman_res_t to specify rman resources.  For
now, this is still compatible with u_long.

This is step one in migrating rman to use uintmax_t for resources instead of
u_long.

Going forward, this could feasibly be used to specify architecture-specific
definitions of resource ranges, rather than baking a specific integer type into
the API.

This change has been broken out to facilitate MFC'ing drivers back to 10 without
breaking ABI.

Reviewed By: jhb
Sponsored by:	Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D5075
2016-01-27 02:23:54 +00:00
cperciva
8e136c4370 Disable suspend when we're shutting down. This solves the "tell FreeBSD
to shut down; close laptop lid" scenario which otherwise tended to end
with a laptop overheating or the battery dying.

The implementation uses a new sysctl, kern.suspend_blocked; init(8) sets
this while rc.suspend runs, and the ACPI sleep code ignores requests while
the sysctl is set.

Discussed on:	freebsd-acpi (35 emails)
MFC after:	1 week
2015-10-01 10:52:26 +00:00
zbb
95f13176f5 Add domain support to PCI bus allocation
When the system has more than a single PCI domain, the bus numbers
are not unique, thus they cannot be used for "pci" device numbering.
Change bus numbers to -1 (i.e. to-be-determined automatically)
wherever the code did not care about domains.

Reviewed by:   jhb
Obtained from: Semihalf
Sponsored by:  The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3406
2015-09-16 23:34:51 +00:00
jkim
0481c185aa Merge ACPICA 20150818. 2015-08-26 17:13:47 +00:00
jkim
28864839a5 Catch up with ACPICA 20150717. 2015-07-22 16:26:17 +00:00
andrew
cd458c627d Add basic support for ACPI. It splits out the nexus driver to two new
drivers, one for fdt, one for acpi. It then uses this to decide if it will
use fdt or acpi.

The GICv2 (interrupt controller) and Generic Timer drivers have been
updated to handle both cases.

As this is early code we still need FDT to find the kernel console, and
some parts are still missing, including PCI support.

Differential Revision:	https://reviews.freebsd.org/D2463
Reviewed by:	jhb, jkim, emaste
Obtained from:	ABT Systems Ltd
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
2015-06-11 15:45:33 +00:00
jkim
f345babbfc Check status of AcpiReadBitRegister() calls.
Reported by:	Coverity
CID:		1306132
2015-06-09 23:13:37 +00:00
jkim
318c4f97e6 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
jkim
d8ed876b18 Do not probe Intel PIIX4 south bridge quirks on amd64. These quirky south
bridges only supported Intel Pentium and Pentium II era processors and there
is no reason for hardware virtualizations to emulate these quirks.

MFC after:	1 week
2015-05-21 19:31:10 +00:00
andrew
6e37027826 Hide code only used on i386 and amd64. 2015-05-11 14:36:34 +00:00
kib
6006bf3a7d If x86 CPU implementation of the MWAIT instruction reasonably
interacts with interrupts, query ACPI and use MWAIT for entrance into
Cx sleep states.  Support C1 "I/O then halt" mode.  See Intel's
document 302223-007 "Intel® Processor Vendor-Specific ACPI Interface
Specification" for description.

Move the acpi_cpu_c1() function into x86/cpu_machdep.c and use
it instead of inlining "sti; hlt" sequence in several places.

In the acpi(4) man page, besides documenting the dev.cpu.N.cx_methods
sysctl, correct the names for dev.cpu.N.{cx_usage,cx_lowest,cx_supported}
sysctls.

Both jkim and avg have some other patches implementing the mwait
functionality; this work is unrelated.  Linux does not rely on the
ACPI to provide correct tables describing Cx modes.  Instead, the
driver has pre-defined knowledge of the CPU models, it was supplied by
Intel.

Tested by:    pho (previous versions)
Sponsored by:	The FreeBSD Foundation
2015-05-09 12:28:48 +00:00
andrew
c7931d093f AcpiGbl_FACS will not be defined when building using the reduced hardware
model. This may be the case on ARM.
2015-05-06 14:14:14 +00:00
andrew
9b8d511aaf If the power management timer is unsupported the PmTimerLength value will
be zero.
2015-05-06 14:09:54 +00:00
andrew
1a7037c6cf There may not be an FACS table, check for this before accessing it.
Sponsored by:	The FreeBSD Foundation
2015-04-28 16:06:58 +00:00
adrian
a189b7bcb9 Refactor out the _PXM -> VM domain lookup done in ACPI, in preparation for
its use in upcoming code.

This is inspired by something in jhb's NUMA IRQ allocation patchset.

However, the tricky bit here is that the PXM lookup for a node may
fail, requiring a lookup on the parent node.  So if it doesn't
exist, don't fail - just go up to the parent.  Only error out of the
lookup if the ACPI lookup returns an error.

Sponsored by:	Norse Corp, Inc.
2015-04-19 17:15:55 +00:00
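The parent-walking lookup described in this commit can be sketched with a toy tree. struct node, lookup_pxm() and the PXM_* constants are hypothetical and do not use the ACPI namespace API; the sketch only shows that a missing _PXM moves on to the parent while an evaluation error stops the search.

```c
#include <assert.h>
#include <stddef.h>

#define	PXM_NONE	(-1)	/* node has no _PXM of its own */
#define	PXM_ERROR	(-2)	/* evaluating _PXM failed */

/* Toy device-tree node; hypothetical, not the ACPI namespace API. */
struct node {
	struct node	*parent;
	int		 pxm;	/* proximity domain, PXM_NONE, or PXM_ERROR */
};

/*
 * Walk from 'n' towards the root and return the first proximity domain
 * found.  A missing _PXM moves on to the parent; only a real evaluation
 * error (or running out of parents) ends the search without a result.
 */
static int
lookup_pxm(const struct node *n)
{
	assert(n != NULL);
	for (; n != NULL; n = n->parent) {
		if (n->pxm == PXM_ERROR)
			return (PXM_ERROR);
		if (n->pxm != PXM_NONE)
			return (n->pxm);
	}
	return (PXM_NONE);
}
```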
kib
1bd8147e56 Define capabilities bits from the revision 007 of the document 302223
"Intel® Processor Vendor-Specific ACPI Interface Specification",
issued Dec 2014.  Previous revision 005 was from Sep 2006.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-04-12 10:28:15 +00:00
jkim
94f110f313 Merge ACPICA 20150410. 2015-04-11 03:23:41 +00:00
jhb
5157da51af Move the message complaining about failed system resource allocations
under bootverbose.  Every example I've seen to date has been due to
an ACPI system resource device reserving a range that overlaps with
system memory (which ram0 attempts to reserve) or a local or I/O APIC
(which apic0 attempts to reserve).  These are always harmless but look
scary to users.

MFC after:	1 week
2015-04-06 17:39:36 +00:00
jhb
7d16cdc7ad Fix a typo. 2015-03-06 20:53:56 +00:00
rstone
e40d09375f Implement interface to create SR-IOV Virtual Functions
Implement the interface to create SR-IOV Virtual Functions (VFs).
When a driver registers that they support SR-IOV by calling
pci_setup_iov(), the SR-IOV code creates a new node in /dev/iov
for that device.  An ioctl can be invoked on that device to
create VFs and have the driver initialize them.

At this point, allocating memory I/O windows (BARs) is not
supported.

Differential Revision:	https://reviews.freebsd.org/D76
Reviewed by:		jhb
MFC after: 		1 month
Sponsored by:		Sandvine Inc.
2015-03-01 00:40:09 +00:00
kib
e8f9899e24 Array cannot be NULL, remove always true comparison. ACPI spec
identifies the tested condition for _PRT as "BYTE value of 0", so the
remaining part of the conditionals is sufficient.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-02-16 22:18:43 +00:00
jhb
571edab7e4 Add a new device control utility for new-bus devices called devctl. This
allows the user to request administrative changes to individual devices
such as attaching or detaching drivers or disabling and re-enabling devices.
- Add a new /dev/devctl2 character device which uses ioctls for device
  requests.  The ioctls use a common 'struct devreq' which is somewhat
  similar to 'struct ifreq'.
- The ioctls identify the device to operate on via a string.  This
  string can either be the device's name, or it can be a bus-specific
  address.  (For unattached devices, a bus address is the only way to
  locate a device.)  Bus drivers register an eventhandler to claim
  unrecognized device names that the driver recognizes as a valid address.
  Two buses currently support addresses: ACPI recognizes any device
  in the ACPI namespace via its full path starting with "\" and
  the PCI bus driver recognizes an address specification of
  'pci[<domain>:]<bus>:<slot>:<func>' (identical to the PCI selector
  strings supported by pciconf).
- To make it easier to cut and paste, change the PnP location string
  in the PCI bus driver to output a full PCI selector string rather
  than 'slot=<slot> function=<func>'.
- Add a devctl(3) interface in libdevctl which provides a wrapper around
  the ioctls and is the preferred interface for other userland code.
- Add a devctl(8) program which is a simple wrapper around the requests
  supported by devctl(3).
- Add a device_is_suspended() function to check DF_SUSPENDED.
- Add a resource_unset_value() function that can be used to remove a
  hint from the kernel environment.  This is used to clear a
  hint.<driver>.<unit>.disabled hint when re-enabling a boot-time
  disabled device.

Reviewed by:	imp (parts)
Requested by:	imp (changing PCI location string)
Relnotes:	yes
2015-02-06 16:09:01 +00:00
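The PCI address form mentioned in this commit, 'pci[<domain>:]<bus>:<slot>:<func>', can be parsed along these lines. This is a userland sketch, not the kernel's parser: struct pci_sel and parse_pci_selector() are hypothetical names, defaulting an omitted domain to 0 is an assumption, and sscanf() is more lenient about whitespace and signs than a real parser should be.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for a parsed selector. */
struct pci_sel {
	unsigned int	domain, bus, slot, func;
};

/*
 * Parse "pci[<domain>:]<bus>:<slot>:<func>".  Returns 0 on success and -1
 * if the string does not match.  An omitted domain is assumed to be 0.
 */
static int
parse_pci_selector(const char *s, struct pci_sel *sel)
{
	unsigned int a, b, c, d;
	int end;

	assert(s != NULL && sel != NULL);

	/* Try the four-field form first; %n records where parsing stopped. */
	end = 0;
	if (sscanf(s, "pci%u:%u:%u:%u%n", &a, &b, &c, &d, &end) == 4 &&
	    s[end] == '\0') {
		sel->domain = a;
		sel->bus = b;
		sel->slot = c;
		sel->func = d;
		return (0);
	}

	/* Fall back to the three-field form without a domain. */
	end = 0;
	if (sscanf(s, "pci%u:%u:%u%n", &a, &b, &c, &end) == 3 &&
	    s[end] == '\0') {
		sel->domain = 0;
		sel->bus = a;
		sel->slot = b;
		sel->func = c;
		return (0);
	}
	return (-1);
}
```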
avg
2a1f5c1f69 hook userland threads suspend + resume into acpi suspend code
Also, split power_suspend into power_suspend and power_suspend_early.

power_suspend_early is called before the userland is frozen.
power_suspend is called after the userland is frozen.

Currently only VT switching is hooked to power_suspend_early.
This is needed because switching away from X server requires its
cooperation, so obviously X server must not be frozen when that happens.

Freezing userland during ACPI suspend is useful because not all drivers
correctly handle suspension concurrent with other activity.  This is
especially applicable to drivers ported from other operating systems
that suspend all software activity between placing drivers and hardware
into suspended state.
In particular drm2/radeon (radeonkms) depends on the described
procedure.  The driver does not have any internal synchronization
between suspension activities and processing of userland requests.

Many thanks to kib for the code that allows to freeze and thaw all
userland threads.

Note that ideally we also need to park / inhibit (non-special) kernel
threads as well to ensure that they do not call into drivers.

MFC after:	17 days
2015-01-27 17:33:18 +00:00
jkim
218f4108d5 Simplify retry loops. No functional change. 2015-01-23 18:55:04 +00:00
jkim
c3bbac9fac Revert r216942. This commit was premature and caused too many complaints.
PR:		162859
MFC after:	3 days
2015-01-23 18:12:44 +00:00
cperciva
37bcfe05fd When disabling C3+ CPU states due to the CPU_QUIRK_NO_C3 quirk, don't
accidentally enable non-existent states.

This bug was triggered if ACPI advertises the presence of a C2 state
which we fail to parse via acpi_PkgGas due to our lack of support for
FFixedHW resources, and causes an immediate panic when an attempt is
made to enter the (NULL) state.

One affected platform is the EC2 c4.8xlarge VM instance type; there
may be others.

MFC after:	1 week
Thanks to:	jkim, @_msw_
2015-01-18 12:45:26 +00:00
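The class of bug described in this commit can be sketched as follows. The structure, its fields and apply_no_c3_quirk() are hypothetical and are not the acpi_cpu(4) code; the sketch only shows that clamping with a minimum cannot raise the state count above what was actually parsed, whereas a plain assignment could.

```c
#include <assert.h>

/* Hypothetical per-CPU state; field names are illustrative only. */
struct cx_info {
	int	count;		/* number of usable Cx states, at least 1 (C1) */
	int	non_c3;		/* index of the deepest state shallower than C3 */
};

/*
 * Limit the usable states to those shallower than C3.  Taking the minimum
 * keeps the count from growing past the states that were actually parsed.
 */
static void
apply_no_c3_quirk(struct cx_info *cx)
{
	int limit;

	assert(cx->count >= 1);
	limit = cx->non_c3 + 1;
	if (cx->count > limit)
		cx->count = limit;
}
```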
jhb
55d0376a65 On some Intel CPUs with a P-state but not C-state invariant TSC the TSC
may also halt in C2 and not just C3 (it seems that in some cases the BIOS
advertises its C3 state as a C2 state in _CST).  Just play it safe and
disable both C2 and C3 states if a user forces the use of the TSC as the
timecounter on such CPUs.

PR:		192316
Differential Revision:	https://reviews.freebsd.org/D1441
No objection from:	jkim
MFC after:	1 week
2015-01-05 20:44:44 +00:00
jkim
c82ae411ef Use the correct device. Note this commit complements r274386.
PR:		194884
2014-11-11 19:42:10 +00:00
adrian
2326e02319 Use the correct device (child) when asking the bus layer about which power
state said device should go into.

This was a snafu introduced in the ACPI/PCI awareness separation.

When putting a device into a power state, the bus (and thus firmware,
eg ACPI) should be asked before hand to check whether the device
can indeed go into that power state.

There's a set of nodes in ACPI under each device - the _SxD nodes - which
state which ACPI power state to put the device into when the system is
going into power save state 'x'.  So when going into S3, the existence
of an _S3D node would override whatever the system was trying to do.

By default the PCI code wants to put devices into D3 before suspending.

I have a laptop here (Asus Zenbook - check the PR) whose EHCI controller
really wants to be in D2 during suspend, not D3.  So if we put it into
D3 and then try to enter S3, everything hangs.  The device itself
can go into D3 - it just can't be there when the call to ACPI to enter
S3 occurs.  The PCI patch fixes this.

jkim@ noticed that the same is needed for the ACPI child device
enumeration.

Thank you to Matt Dillon (the programmer, not the actor) for buying me
this particular laptop so I could debug the issues with the Atheros
AR9485 that is in it.  It's his fault that I ended up with this
laptop and was sufficiently annoyed by the lack of USB suspend
to go down this rabbit hole.

Tested:

* Thinkpad T400
* Thinkpad X230
* Thinkpad T42
* Thinkpad T60
* Asus Zenbook (see PR)
* Asus EEEPC 701
* Asus EEEPC 1001PX

TODO:

* Figure out what we should do about devices we unload drivers for
  that want to be in a specific state when entering S3 / S4 -
  the "put devices into D3 if they're not bound to a driver" option
  may also mess with things.

PR:		kern/194884
Reviewed by:	jhb, jkim
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Matt Dillon <dillon@apollo.backplane.com> (hardware)
2014-11-11 17:14:11 +00:00
hselasky
a0b8ff0c54 The SYSCTL data pointers can come from userspace and must not be
directly accessed. Although this will work on some platforms, it can
throw an exception if the pointer is invalid and then panic the kernel.

Add a missing SYSCTL_IN() of "SCTP_BASE_STATS" structure.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-28 12:00:39 +00:00
kib
f40f3eb0c5 Set the caching mode for the usermode mapping of the HPET registers
page to uncached.

Reviewed by:	rpaulo
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2014-10-25 21:01:50 +00:00
rpaulo
d3c80736fe Add a sysctl to control the HPET allow_write behaviour.
Requested by:	kib
2014-10-24 21:08:36 +00:00
rpaulo
7ecde33ff4 HPET: avoid handling the multiple file-descriptor case.
It had two bugs: one where mmap was still allowed and another where
D_TRACKCLOSE doesn't handle all cases.

Thanks to jhb and kib for pointing them out.
MFC after:	1 week
2014-10-24 19:58:00 +00:00
rpaulo
65e80d25ca HPET: create /dev/hpetN as a way to access HPET from userland.
In some cases, TSC is broken and special applications might benefit
from memory mapping HPET and reading the registers to count time.
Most often the main HPET counter is 32-bit only[1], so this only gives
the application a 300 second window based on the default HPET
interval.
Other applications, such as Intel's DPDK, expect /dev/hpet to be
present and use it to count time as well.

Although we have an almost userland version of gettimeofday() which
uses rdtsc in userland, it's not always possible to use it, depending
on how broken the multi-socket hardware is.

Install the acpi_hpet.h so that applications can use the HPET register
definitions.

[1] I haven't found a system where HPET's main counter uses more than
32 bits.  There seems to be a discrepancy in the Intel documentation
(claiming it's a 64-bit counter) and the actual implementation (a
32-bit counter in a 64-bit memory area).

MFC after:	1 week
Relnotes:	yes
2014-10-24 18:39:15 +00:00
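The 300 second window mentioned in this commit follows from the counter width and tick rate. The sketch below assumes a 14.31818 MHz HPET, which is a common value but an assumption here (real hardware reports its own period), and the function names are hypothetical. It also shows how an application can measure intervals shorter than one wrap by using unsigned subtraction.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed HPET frequency (14.31818 MHz); real hardware reports its own period. */
#define	HPET_FREQ_HZ	14318180u

/* Seconds until a 32-bit counter running at 'freq_hz' wraps around. */
static double
wrap_seconds(uint32_t freq_hz)
{
	assert(freq_hz != 0);
	return (4294967296.0 / freq_hz);
}

/*
 * Ticks elapsed between two reads of a 32-bit counter.  Unsigned
 * subtraction is modulo 2^32, so a single wrap between reads is handled.
 */
static uint32_t
elapsed_ticks(uint32_t before, uint32_t after)
{
	return (after - before);
}
```

Intervals longer than one wrap cannot be recovered this way, which is the limitation the commit message points out.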
davide
e88bd26b3f Follow up to r225617. In order to maximize the re-usability of kernel code
in userland rename in-kernel getenv()/setenv() to kern_setenv()/kern_getenv().
This fixes a namespace collision with libc symbols.

Submitted by:   kmacy
Tested by:      make universe
2014-10-16 18:04:43 +00:00
adrian
b0c040ce18 Add a bus method to fetch the VM domain for the given device/bus.
* Add a bus_if.m method - get_domain() - returning the VM domain or
  ENOENT if the device isn't in a VM domain;
* Add bus methods to print out the domain of the device if appropriate;
* Add code in srat.c to save the PXM -> VM domain mapping that's done and
  expose a function to translate VM domain -> PXM;
* Add ACPI and ACPI PCI methods to check if the bus has a _PXM attribute
  and if so map it to the VM domain;
* (.. yes, this works recursively.)
* Have the pci bus glue print out the device VM domain if present.

Note: this is just the plumbing to start enumerating information -
it doesn't at all modify behaviour.

Differential Revision:	D906
Reviewed by:	jhb
Sponsored by:	Norse Corp
2014-10-09 05:33:25 +00:00
jkim
3f8a9f0ec1 Merge ACPICA 20140926. 2014-10-02 19:11:18 +00:00
will
b61070b384 Add sysctl to track the resource consumption of ACPI interrupts.
Submitted by:	gibbs
MFC after:	1 month
Sponsored by:	Spectra Logic
MFSpectraBSD:	636827 on 2012/09/28
2014-10-01 14:35:52 +00:00
royger
c5a5f5947f msi: add Xen MSI implementation
This patch adds support for MSI interrupts when running on Xen. Apart
from adding the Xen related code needed in order to register MSI
interrupts this patch also makes the msi_init function a hook in
init_ops, so different MSI implementations can have different
initialization functions.

Sponsored by: Citrix Systems R&D

xen/interface/physdev.h:
 - Add the MAP_PIRQ_TYPE_MULTI_MSI to map multi-vector MSI to the Xen
   public interface.

x86/include/init.h:
 - Add a hook for setting custom msi_init methods.

amd64/amd64/machdep.c:
i386/i386/machdep.c:
 - Set the default msi_init hook to point to the native MSI
   initialization method.

x86/xen/pv.c:
 - Set the Xen MSI init hook when running as a Xen guest.

x86/x86/local_apic.c:
 - Call the msi_init hook instead of directly calling msi_init.

xen/xen_intr.h:
x86/xen/xen_intr.c:
 - Introduce support for registering/releasing MSI interrupts with
   Xen.
 - The MSI interrupts will use the same PIC as the IO APIC interrupts.

xen/xen_msi.h:
x86/xen/xen_msi.c:
 - Introduce a Xen MSI implementation.

x86/xen/xen_nexus.c:
 - Overwrite the default MSI hooks in the Xen Nexus to use the Xen MSI
   implementation.

x86/xen/xen_pci.c:
 - Introduce a Xen specific PCI bus that inherits from the ACPI PCI
   bus and overwrites the native MSI methods.
 - This is needed because when running under Xen the MSI messages used
   to configure MSI interrupts on PCI devices are written by Xen
   itself.

dev/acpica/acpi_pci.c:
 - Lower the quality of the ACPI PCI bus so the newly introduced Xen
   PCI bus can take over when needed.

conf/files.i386:
conf/files.amd64:
 - Add the newly created files to the build process.
2014-09-30 16:46:45 +00:00
jhb
d08fb7f877 Convert from timeout(9) to callout(9). 2014-09-22 14:27:26 +00:00
adrian
fece021428 Populate the device info string with _PXM (proximity domain) information.
This is primarily useful for debugging right now - it'll show up in
devinfo.

Reviewed by:	jhb
2014-09-20 04:31:12 +00:00
jhb
5c3c9f4571 Revert unrelated changes accidentally committed in r271192. 2014-09-17 18:55:39 +00:00
jhb
3a8cf1a38b Create a separate structure for per-CPU state saved across suspend and
resume that is a superset of a pcb.  Move the FPU state out of the pcb and
into this new structure.  As part of this, move the FPU resume code on
amd64 into a C function.  This allows resumectx() to still operate only on
a pcb and more closely mirrors the i386 code.

Reviewed by:	kib (earlier version)
2014-09-06 15:23:28 +00:00
neel
ccce21b061 Fix typo when displaying the HPET timer unit number. 2014-08-13 00:18:16 +00:00
royger
8daf97263e xen: add ACPI bus to xen_nexus when running as Dom0
Also disable a couple of ACPI devices that are not usable under Dom0.
To this end a couple of booleans are added that allow disabling ACPI
specific devices.

Sponsored by: Citrix Systems R&D
Reviewed by: jhb

x86/xen/xen_nexus.c:
 - Return BUS_PROBE_SPECIFIC in the Xen Nexus attachment routine to
   force the usage of the Xen Nexus.
 - Attach the ACPI bus when running as Dom0.

dev/acpica/acpi_cpu.c:
dev/acpica/acpi_hpet.c:
dev/acpica/acpi_timer.c:
 - Add a variable that gates the addition of the devices.

x86/include/init.h:
 - Declare variables that control the attachment of ACPI cpu, hpet and
   timer devices.
2014-08-04 09:05:28 +00:00
marcel
9f28abd980 Remove ia64.
This includes:
o   All directories named *ia64*
o   All files named *ia64*
o   All ia64-specific code guarded by __ia64__
o   All ia64-specific makefile logic
o   Mention of ia64 in comments and documentation

This excludes:
o   Everything under contrib/
o   Everything under crypto/
o   sys/xen/interface
o   sys/sys/elf_common.h

Discussed at: BSDcan
2014-07-07 00:27:09 +00:00
hselasky
35b126e324 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
gjb
fc21f40567 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
hselasky
bd1ed65f0f Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating children of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading kludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
hselasky
58998f2b0b Remove not needed initialisation code. 2014-06-26 10:48:01 +00:00
jhb
82be9401d7 Expand r261243 even further and ignore any I/O port resources assigned to
PCI root bridges except for the one known-valid case on x86 where bridges
claim the I/O port registers used for PCI config space access.

Tested by:	Hilko Meyer <hilko.meyer@gmx.de>
MFC after:	1 week
2014-06-25 20:30:47 +00:00
jhb
c161353534 Trust the state of a power resource that we get from a working _STA method
instead of trying to cache it.

Previously, we only trusted the state if we did not have a cached state.
However, once a state was cached, the _STA method was always ignored.
Specifically, once a power resource had been turned on once (e.g.
during resume), the driver assumed it was always on even if _STA said it
was off and never turned it back on.  This prevented the power resource
from being turned back on if a laptop was resumed twice, for example.

To fix, just remove the cached state entirely and always use the results
of _STA.  The loops already skip any resources where _STA fails.

Submitted by:	trasz (initial patch to invoke _ON)
MFC after:	1 week
2014-06-19 18:35:14 +00:00
smh
5008097060 Remove duplicate SYSCTL_DECL(_debug_acpi) which was breaking tinderbox
MFC after:	2 weeks
X-MFC-With: r264849
2014-04-24 14:58:12 +00:00
smh
d71204e7c2 Increase ACPI_MAX_TASKS to be 4 x the number of CPUs as 2 x was still
insufficient on some machines

MFC after:	2 weeks
2014-04-24 12:38:07 +00:00
smh
d311afade4 Exposed debug.acpi.max_tasks and debug.acpi.max_threads via sysctls so their
values can be viewed.
2014-04-24 00:41:02 +00:00
adrian
3a9d485daa Add a basic set of data points which count the number of sleep entries
that are being done by the OS.

For now this'll match up with the "wakeups"; although I'll dig deeper into
this to see if we can determine which sleep state the CPU managed to get
into.  Most things I've seen these days only expose up to C2 or C3 via
ACPI even though the CPU goes all the way down to C6 or C7.
2014-04-08 02:36:27 +00:00
imp
328516a03e Turns out name was used here when ACPI_DEBUG was defined, so refine my
previous patch.
2014-03-31 19:37:39 +00:00
imp
eebc91c3f0 Remove instances of variables that were set, but never used. gcc 4.9
warns about these by default.
2014-03-30 23:43:36 +00:00
jhb
6e6e271c34 Add support for managing PCI bus numbers. As with BARs and PCI-PCI bridge
I/O windows, the default is to preserve the firmware-assigned resources.
PCI bus numbers are only managed if NEW_PCIB is enabled and the architecture
defines a PCI_RES_BUS resource type.
- Add a helper API to create top-level PCI bus resource managers for each
  PCI domain/segment.  Host-PCI bridge drivers use this API to allocate
  bus numbers from their associated domain.
- Change the PCI bus and CardBus drivers to allocate a bus resource for
  their bus number from the parent PCI bridge device.
- Change the PCI-PCI and PCI-CardBus bridge drivers to allocate the
  full range of bus numbers from secbus to subbus from their parent bridge.
  The drivers also always program their primary bus register.  The bridge
  drivers also support growing their bus range by extending the bus resource
  and updating subbus to match the larger range.
- Add support for managing PCI bus resources to the Host-PCI bridge drivers
  used for amd64 and i386 (acpi_pcib, mptable_pcib, legacy_pcib, and qpi_pcib).
- Define a PCI_RES_BUS resource type for amd64 and i386.

Reviewed by:	imp
MFC after:	1 month
2014-02-12 04:30:37 +00:00
jhb
a638f0acf5 Some BIOSes incorrectly use standard memory resource ranges to list
the memory ranges that they decode for downstream devices rather than
creating ResourceProducer range resource entries.  The result is that
we allocate the full range to the PCI root bridge device causing
allocations in child devices to all fail.

As a workaround, ignore any standard memory resources on a PCI root
bridge device.  It is normal for a PCI root bridge to allocate an I/O
resource for the I/O ports used for PCI config access, but I have not
seen any PCI root bridges that legitimately allocate a memory resource.

Reviewed by:	jkim
MFC after:	1 week
2014-01-28 20:53:33 +00:00
eadler
44c01df173 Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit.  Instead use (1U << 31) which gets the
expected result.

This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.

A similar change was made in OpenBSD.

Discussed with:	-arch, rdivacky
Reviewed by:	cperciva
2013-11-30 22:17:27 +00:00
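The shift problem described in 44c01df173 can be illustrated with a minimal sketch. The macro and function names below are hypothetical and are not taken from the FreeBSD tree; the first macro uses the form the commit describes, and the second shows one possible alternative that does not depend on the width of int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical flag names, for illustration only.
 *
 * Writing the flag as (1 << 31) shifts a signed int into its sign bit,
 * which is undefined behavior when int is 32 bits wide.
 */
#define EXAMPLE_TOP_BIT		(1U << 31)		/* form used in the commit */
#define EXAMPLE_TOP_BIT_U32	(UINT32_C(1) << 31)	/* does not rely on int width */

/* The 1U form still assumes unsigned int has at least 32 bits. */
static_assert(sizeof(unsigned int) * CHAR_BIT >= 32,
    "1U << 31 needs an unsigned int of at least 32 bits");

/* Return non-zero when bit 31 of v is set. */
static int
top_bit_is_set(uint32_t v)
{
	return ((v & EXAMPLE_TOP_BIT) != 0);
}
```

The static assertion only documents the caveat from the commit message (the fix assumes a 32-bit int); it is an addition for this sketch, not part of the original change.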
mav
f6b2403684 Handle case when ACPI reports HPET device, but does not provide memory
resource for it.  In such case take the address range from the HPET table.

This fixes hpet(4) driver attach on Asrock C2750D4I board.
2013-11-15 11:32:19 +00:00
nwhitehorn
3e9a158ce7 More BUS_PROBE_NOWILDCARD sweeping. Some devices here (if_ath_ahb and siba)
resist easy conversion since they implement a great deal of their attach
logic inside probe(). Some of this could be fixed by moving it to attach(),
but some requires something more subtle than BUS_PROBE_NOWILDCARD.
2013-10-29 14:19:42 +00:00
kib
74b8996ebe Import the driver for VT-d DMAR hardware, as specified in the revision
1.3 of Intel® Virtualization Technology for Directed I/O Architecture
Specification.  The Extended Context and PASIDs from the rev. 2.2 are
not supported, but I am not aware of any released hardware which
implements them.  Code does not use queued invalidation, see comments
for the reason, and does not provide interrupt remapping services.

Code implements the management of the guest address space per domain
and allows to establish and tear down arbitrary mappings, but not
partial unmapping.  The superpages are created as needed, but not
promoted.  Faults are recorded, fault records could be obtained
programmatically, and printed on the console.

Implement the busdma(9) using DMARs.  This busdma backend avoids
bouncing and provides security against misbehaving hardware and driver
bad programming, preventing leaks and corruption of the memory by wild
DMA accesses.

By default, the implementation is compiled into amd64 GENERIC kernel
but disabled; to enable, set hw.dmar.enable=1 loader tunable.  Code is
written to work on i386, but testing there was low priority, and
driver is not enabled in GENERIC.  Even with the DMAR turned on,
individual devices could be directed to use the bounce busdma with the
hw.busdma.pci<domain>:<bus>:<device>:<function>.bounce=1 tunable.  If
DMARs are capable of the pass-through translations, it is used,
otherwise, an identity-mapping page table is constructed.

The driver was tested on Xeon 5400/5500 chipset legacy machine,
Haswell desktop and E5 SandyBridge dual-socket boxes, with ahci(4),
ata(4), bce(4), ehci(4), mfi(4), uhci(4), xhci(4) devices.  It also
works with em(4) and igb(4), but there some fixes are needed for
drivers, which are not committed yet.  Intel GPUs do not work with
DMAR (yet).

Many thanks to John Baldwin, who explained the newbus integration to me;
Peter Holm, who did all testing and helped me to discover and
understand several incredible bugs; and to Jim Harris for the access
to the EDS and BWG and for listening when I have to explain my
findings to somebody.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
2013-10-28 13:33:29 +00:00
gibbs
a9c07a6f67 Add support for suspend/resume/migration operations when running as a
Xen PVHVM guest.

Submitted by:	Roger Pau Monné
Sponsored by:	Citrix Systems R&D
Reviewed by:	gibbs
Approved by:	re (blanket Xen)
MFC after:	2 weeks

sys/amd64/amd64/mp_machdep.c:
sys/i386/i386/mp_machdep.c:
	- Make sure that there are no MMU related IPIs pending on migration.
	- Reset pending IPI_BITMAP on resume.
	- Init vcpu_info on resume.

sys/amd64/include/intr_machdep.h:
sys/i386/include/intr_machdep.h:
sys/x86/acpica/acpi_wakeup.c:
sys/x86/x86/intr_machdep.c:
sys/x86/isa/atpic.c:
sys/x86/x86/io_apic.c:
sys/x86/x86/local_apic.c:
	- Add a "suspend_cancelled" parameter to pic_resume().  For the
	  Xen PIC, restoration of interrupt services differs between
	  the aborted suspend and normal resume cases, so we must provide
	  this information.

sys/dev/acpica/acpi_timer.c:
sys/dev/xen/timer/timer.c:
sys/timetc.h:
	- Don't swap out "suspend safe" timers across a suspend/resume
	  cycle.  This includes the Xen PV and ACPI timers.

sys/dev/xen/control/control.c:
	- Perform proper suspend/resume process for PVHVM:
		- Suspend all APs before going into suspension, this allows us
		  to reset the vcpu_info on resume for each AP.
		- Reset shared info page and callback on resume.

sys/dev/xen/timer/timer.c:
	- Implement suspend/resume support for the PV timer. Since FreeBSD
	  doesn't perform a per-cpu resume of the timer, we need to call
	  smp_rendezvous in order to correctly resume the timer on each CPU.

sys/dev/xen/xenpci/xenpci.c:
	- Don't reset the PCI interrupt on each suspend/resume.

sys/kern/subr_smp.c:
	- When suspending a PVHVM domain make sure there are no MMU IPIs
	  in-flight, or we will get a lockup on resume due to the fact that
	  pending event channels are not carried over on migration.
	- Implement a generic version of restart_cpus that can be used by
	  suspended and stopped cpus.

sys/x86/xen/hvm.c:
	- Implement resume support for the hypercall page and shared info.
	- Clear vcpu_info so it can be reset by APs when resuming from
	  suspension.

sys/dev/xen/xenpci/xenpci.c:
sys/x86/xen/hvm.c:
sys/x86/xen/xen_intr.c:
	- Support UP kernel configurations.

sys/x86/xen/xen_intr.c:
	- Properly rebind per-cpus VIRQs and IPIs on resume.
2013-09-20 05:06:03 +00:00
dumbbell
49a58122d3 acpi_thermal: Warn about insane _TMP temperature only once
A warning is emitted again if the temperature became briefly valid
meanwhile. This avoids spamming the user when the sensor is broken.

Other values (i.e. not _TMP) always raise a warning.
2013-08-30 19:21:12 +00:00
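The warn-once behavior described in 49a58122d3 can be sketched as a small state machine. This is only an illustration under assumed names (`tmp_warn_state`, `tmp_should_warn`); it is not the code from acpi_thermal, and how a reading is judged sane is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-sensor state for this sketch. */
struct tmp_warn_state {
	bool warned;	/* an insane reading has already been reported */
};

/*
 * Return true when the caller should print a warning for this reading.
 * A sane reading re-arms the warning, so a later insane reading is
 * reported again; consecutive insane readings are reported only once.
 */
static bool
tmp_should_warn(struct tmp_warn_state *st, bool reading_is_sane)
{
	assert(st != NULL);

	if (reading_is_sane) {
		st->warned = false;
		return (false);
	}
	if (st->warned)
		return (false);
	st->warned = true;
	return (true);
}
```

With a permanently broken sensor the function returns true only for the first reading, which matches the stated goal of not spamming the user.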
jkim
c55a3ec3ad Tidy up global locks for ACPICA. There is no functional change. 2013-08-13 21:34:03 +00:00
jhb
a198a61e54 Work around some broken BIOSes that specify edge-sensitive but active-low
settings for ACPI-enumerated serial ports by forcing any IRQs that use
an ISA IRQ value with these settings to active-high instead of active-low.

This is known to occur with the BIOS on an Intel D2500CCE motherboard.

Tested by:	Robert Ames <robertames@hotmail.com>, lev
Submitted by:	Juergen Weiss weiss at uni-mainz.de (original patch)
2013-07-16 14:42:16 +00:00
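The workaround in a198a61e54 amounts to a small fix-up rule, sketched below. The enum and function names are hypothetical and do not come from the ACPI code; the sketch assumes that "ISA IRQ value" means IRQs 0 through 15.

```c
#include <assert.h>

/* Hypothetical types for this sketch. */
enum ex_trigger  { EX_TRIGGER_EDGE, EX_TRIGGER_LEVEL };
enum ex_polarity { EX_POLARITY_HIGH, EX_POLARITY_LOW };

#define EX_NUM_ISA_IRQS	16	/* assumed: ISA IRQs are 0..15 */

/*
 * Return the polarity to program for an interrupt.  An edge-triggered,
 * active-low ISA IRQ is treated as a firmware bug and forced to
 * active-high; every other combination is passed through unchanged.
 */
static enum ex_polarity
ex_fixup_polarity(unsigned int irq, enum ex_trigger trig,
    enum ex_polarity pol)
{
	assert(pol == EX_POLARITY_HIGH || pol == EX_POLARITY_LOW);

	if (irq < EX_NUM_ISA_IRQS && trig == EX_TRIGGER_EDGE &&
	    pol == EX_POLARITY_LOW)
		return (EX_POLARITY_HIGH);
	return (pol);
}
```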
jhb
eaaf7e1bb0 Don't perform the acpi_DeviceIsPresent() check for PCI-PCI bridges. If
we are probing a PCI-PCI bridge it is because we found one by enumerating
the devices on a PCI bus, so the bridge is definitely present.  A few
BIOSes report incorrect status (_STA) for some bridges that claimed they
were not present when in fact they were.

While here, move this check earlier for Host-PCI bridges so attach fails
before doing any work that needs to be torn down.

PR:		kern/91594
Tested by:	Jack Vogel @ Intel
MFC after:	1 week
2013-07-03 17:26:05 +00:00
jkim
d6fbe88f06 Consistently cast ACPICA 64-bit integer types when we print them. 2013-06-26 23:52:10 +00:00
jkim
ca7944e405 Merge ACPICA 20130517. 2013-05-20 23:52:49 +00:00
jkim
51db6f82a1 - Prefer ACPI_COMPARE_NAME(a, b) macro over strncmp(a, b, ACPI_NAME_SIZE).
- Make sure the predefined name is a string type.
- Return slightly more useful errors.
2013-05-20 22:18:18 +00:00
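As a rough illustration of the first item in 51db6f82a1: ACPI names are four characters long, so two names can be compared as fixed-size values rather than with a length-limited string comparison. The helper below is a hypothetical stand-in written for this sketch; it is not the ACPICA definition of ACPI_COMPARE_NAME, and it assumes both arguments point to at least four readable bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_NAME_SIZE	4	/* ACPI names are four characters long */

/*
 * Hypothetical stand-in for a fixed-width name comparison.
 * Returns non-zero when the first four bytes of a and b are identical.
 */
static int
ex_compare_name(const char *a, const char *b)
{
	uint32_t x, y;

	assert(a != NULL && b != NULL);

	/* memcpy avoids alignment and aliasing problems. */
	memcpy(&x, a, EX_NAME_SIZE);
	memcpy(&y, b, EX_NAME_SIZE);
	return (x == y);
}
```

Unlike strncmp(), this always examines all four bytes and does not stop at an embedded NUL, so the two approaches agree only for well-formed four-character names.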
jkim
4a97d198fc Fix white spaces. 2013-05-20 22:10:01 +00:00
jhb
d02e1b3646 - Some BIOSes use an Extended IRQ resource descriptor in _PRS for a link
  that uses non-ISA IRQs but use a plain IRQ resource in _CRS.  However,
  a non-ISA IRQ can't fit into a plain IRQ resource.  If we encounter a
  link like this, build the resource buffer from _PRS instead of _CRS.
- Set the correct size of the end tag in a resource buffer.

Tested by:	Benjamin Lee <ben@b1c1l1.com>
MFC after:	2 weeks
2013-04-22 15:51:06 +00:00
rpaulo
afdfb863d6 Fix a typo in a comment. 2013-03-17 07:28:17 +00:00
mav
0b0a5d2fe0 Add "else" missed at r248154. 2013-03-11 17:29:09 +00:00