freebsd-dev

Author	SHA1	Message	Date
Konstantin Belousov	1680854946	Implement userspace gettimeofday(2) with HPET timecounter. Right now, userspace (fast) gettimeofday(2) on x86 only works for RDTSC. For older machines, like Core2, where RDTSC is not C2/C3 invariant, and which fall to HPET hardware, this means that the call has both the penalty of the syscall and of the uncached hw behind the QPI or PCIe connection to the sought bridge. Nothing can me done against the access latency, but the syscall overhead can be removed. System already provides mappable /dev/hpetX devices, which gives straight access to the HPET registers page. Add yet another algorithm to the x86 'vdso' timehands. Libc is updated to handle both RDTSC and HPET. For HPET, the index of the hpet device to mmap is passed from kernel to userspace, index might be changed and libc invalidates its mapping as needed. Remove cpu_fill_vdso_timehands() KPI, instead require that timecounters which can be used from userspace, to provide tc_fill_vdso_timehands{,32}() methods. Merge i386 and amd64 libc/<arch>/sys/__vdso_gettc.c into one source file in the new libc/x86/sys location. __vdso_gettc() internal interface is changed to move timecounter algorithm detection into the MD code. Measurements show that RDTSC even with the syscall overhead is faster than userspace HPET access. But still, userspace HPET is three-four times faster than syscall HPET on several Core2 and SandyBridge machines. Tested by: Howard Su <howard0su@gmail.com> Sponsored by: The FreeBSD Foundation MFC after: 1 month Differential revision: https://reviews.freebsd.org/D7473	2016-08-17 09:52:09 +00:00
Konstantin Belousov	51c762d172	By default, allow all to read the HPET registers pages. At the same time, by, by default disallow writes to the mmaped HPET pages. Intent is to allow userspace to use HPET as fast (i.e. no-syscall) timecounter for gettimeofday(2). Unfortunately, the permission model does not make it possible to safely unhide /dev/hpet in the jails even if default mode is set to 0444, because untrusted jailed root may change device permissions to writeable. Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2016-08-17 09:20:04 +00:00
Jung-uk Kim	0aed566c32	Remove a tunable and always reset system clock while resuming with ACPI. Requested by: bde (long ago)	2016-07-13 19:16:32 +00:00
Conrad Meyer	00ee4cd78a	Fix a minor leak in ACPI thermal Introduced in r301518. Reported by: Coverity CID: 1356266 Sponsored by: EMC / Isilon Storage Division	2016-06-07 19:08:13 +00:00
John Baldwin	9f70e94e68	Defer the creation of ACPI thermal kthreads to a startup sysinit. The SYSINIT runs at SI_SUB_KICK_SCHEDULER after the scheduler is fully initialized and timers are working. This fixes booting in the EARLY_AP_STARTUP case.	2016-06-06 20:28:53 +00:00
Adrian Chadd	85cb8105ab	[acpi] graphics drivers want access to acpi lid handle the graphics drivers can benefit from access to the lid handle for querying and getting notifications Submitted by: kmacy Differential Revision: https://reviews.freebsd.org/D6643	2016-06-05 02:02:51 +00:00
Luiz Otavio O Souza	9d6672e13b	Fix the deciKelvin to Celsius conversion in kernel. After r285994, sysctl(8) was fixed to use 273.15 instead of 273.20 as 0C reference and as result, the temperature read in sysctl(8) now exibits a +0.1C difference. This commit fix the kernel references to match the reference value used in sysctl(8) after r285994. Sponsored by: Rubicon Communications (Netgate)	2016-05-22 13:58:32 +00:00
John Baldwin	6f33eaa5f0	Implement a proper detach method for the PCI-PCI bridge driver. - Add a pcib_detach() function for the PCI-PCI bridge driver. It tears down the NEW_PCIB and hotplug state including destroying resource managers, deleting child devices, and disabling hotplug events. - Add a detach method to the ACPI PCI-PCI bridge driver which calls pcib_detach() and then frees the copy of the _PRT interrupt routing table. - Add a detach method to the PCI-Cardbus bridge driver which frees the PCI bus resources in addition to calling cbb_detach(). - Explicitly clear any pending hotplug events during attach to ensure future events will generate an interrupt. - If a the Command Completed bit is set in the slot status register when the command completion timeout fires, treat it as if the command completed and the completion interrupt was just lost rather than forcing a detach. - Don't wait for a Command Completed notification if Command Completion interrupts are disabled. The spec explicitly says no interrupt is enabled when clearing CCIE, and on my T400 no interrupt is generated when CCIE is changed from cleared to set, either. In addition, the T400 doesn't appear to set the Command Completed bit in the cases where it doesn't generate an interrupt, so don't schedule the timer either. (If the CC bit were always set, one could always set the timer and rely on the logic of treating CC set as a missed interrupt.) Reviewed by: imp (older version) Differential Revision: https://reviews.freebsd.org/D6424	2016-05-20 00:03:22 +00:00
John Baldwin	ce83de10cb	Use polling spin loops for timeouts during early boot. Some ACPI operations such as mutex acquires and event waits accept a timeout. The ACPI OSD layer implements these timeouts by using regular sleep timeouts. However, this doesn't work during early boot before event timers are setup. Instead, use polling combined with DELAY() to spin. This fixes booting on upcoming Intel systems with Kaby Lake processors. Tested by: "Jeffrey E Pieper" <jeffrey.e.pieper@intel.com> Reviewed by: jimharris MFC after: 1 week	2016-05-16 21:33:31 +00:00
John Baldwin	fdce57a042	Add an EARLY_AP_STARTUP option to start APs earlier during boot. Currently, Application Processors (non-boot CPUs) are started by MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until SI_SUB_SMP at which point they are released to run kernel threads. SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter the scheduler and start running threads until fairly late in the boot. This change moves SI_SUB_SMP up to just before software interrupt threads are created allowing the APs to start executing kernel threads much sooner (before any devices are probed). This allows several initialization routines that need to perform initialization on all CPUs to now perform that initialization in one step rather than having to defer the AP initialization to a second SYSINIT run at SI_SUB_SMP. It also permits all CPUs to be available for handling interrupts before any devices are probed. This last feature fixes a problem on with interrupt vector exhaustion. Specifically, in the old model all device interrupts were routed onto the boot CPU during boot. Later after the APs were released at SI_SUB_SMP, interrupts were redistributed across all CPUs. However, several drivers for multiqueue hardware allocate N interrupts per CPU in the system. In a system with many CPUs, just a few drivers doing this could exhaust the available pool of interrupt vectors on the boot CPU as each driver was allocating N * mp_ncpu vectors on the boot CPU. Now, drivers will allocate interrupts on their desired CPUs during boot meaning that only N interrupts are allocated from the boot CPU instead of N * mp_ncpu. Some other bits of code can also be simplified as smp_started is now true much earlier and will now always be true for these bits of code. This removes the need to treat the single-CPU boot environment as a special case. As a transition aid, the new behavior is available under a new kernel option (EARLY_AP_STARTUP). This will allow the option to be turned off if need be during initial testing. I plan to enable this on x86 by default in a followup commit in the next few days and to have all platforms moved over before 11.0. Once the transition is complete, the option will be removed along with the !EARLY_AP_STARTUP code. These changes have only been tested on x86. Other platform maintainers are encouraged to port their architectures over as well. The main things to check for are any uses of smp_started in MD code that can be simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in the EARLY_AP_STARTUP case (e.g. the interrupt shuffling). PR: kern/199321 Reviewed by: markj, gnn, kib Sponsored by: Netflix	2016-05-14 18:22:52 +00:00
Edward Tomasz Napierala	084d207584	Remove misc NULL checks after M_WAITOK allocations. MFC after: 1 month Sponsored by: The FreeBSD Foundation	2016-05-10 10:26:07 +00:00
John Baldwin	8d791e5af1	Add a new bus method to fetch device-specific CPU sets. bus_get_cpus() returns a specified set of CPUs for a device. It accepts an enum for the second parameter that indicates the type of cpuset to request. Currently two valus are supported: - LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to the device when DEVICE_NUMA is enabled) - INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core) For systems that do not support NUMA (or if it is not enabled in the kernel config), LOCAL_CPUS fails with EINVAL. INTR_CPUS is mapped to 'all_cpus' by default. The idea is that INTR_CPUS should always return a valid set. Device drivers which want to use per-CPU interrupts should start using INTR_CPUS instead of simply assigning interrupts to all available CPUs. In the future we may wish to add tunables to control the policy of INTR_CPUS (e.g. should it be local-only or global, should it ignore SMT threads or not). The x86 nexus driver exposes the internal set of interrupt CPUs from the the x86 interrupt code via INTR_CPUS. The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled. They also and the global INTR_CPUS set from the nexus driver with the per-domain set from _PXM to generate a local INTR_CPUS set for child devices. Compared to the r298933, this version uses 'struct _cpuset' in <sys/bus.h> instead of 'cpuset_t' to avoid requiring <sys/param.h> (<sys/_cpuset.h> still requires <sys/param.h> for MAXCPU even though <sys/_bitset.h> does not after recent changes).	2016-05-09 20:50:21 +00:00
John Baldwin	82cb5c3b5b	Native PCI-express HotPlug support. PCI-express HotPlug support is implemented via bits in the slot registers of the PCI-express capability of the downstream port along with an interrupt that triggers when bits in the slot status register change. This is implemented for FreeBSD by adding HotPlug support to the PCI-PCI bridge driver which attaches to the virtual PCI-PCI bridges representing downstream ports on HotPlug slots. The PCI-PCI bridge driver registers an interrupt handler to receive HotPlug events. It also uses the slot registers to determine the current HotPlug state and drive an internal HotPlug state machine. For simplicty of implementation, the PCI-PCI bridge device detaches and deletes the child PCI device when a card is removed from a slot and creates and attaches a PCI child device when a card is inserted into the slot. The PCI-PCI bridge driver provides a bus_child_present which claims that child devices are present on HotPlug-capable slots only when a card is inserted. Rather than requiring a timeout in the RC for config accesses to not-present children, the pcib_read/write_config methods fail all requests when a card is not present (or not yet ready). These changes include support for various optional HotPlug capabilities such as a power controller, mechanical latch, electro-mechanical interlock, indicators, and an attention button. It also includes support for devices which require waiting for command completion events before initiating a subsequent HotPlug command. However, it has only been tested on ExpressCard systems which support surprise removal and have none of these optional capabilities. PCI-express HotPlug support is conditional on the PCI_HP option which is enabled by default on arm64, x86, and powerpc. Reviewed by: adrian, imp, vangyzen (older versions) Relnotes: yes Differential Revision: https://reviews.freebsd.org/D6136	2016-05-05 22:26:23 +00:00
Jung-uk Kim	c19e9bb3b0	Fix intmax_t to uintptr_t casting on 32-bit platforms. Found by GCC. Submitted by: bde	2016-05-05 18:43:31 +00:00
Pedro F. Giffuni	453130d9bf	sys/dev: minor spelling fixes. Most affect comments, very few have user-visible effects.	2016-05-03 03:41:25 +00:00
John Baldwin	8a08b7d36b	Revert bus_get_cpus() for now. I really thought I had run this through the tinderbox before committing, but many places need <sys/types.h> -> <sys/param.h> for <sys/bus.h> now.	2016-05-03 01:17:40 +00:00
John Baldwin	bc153c692f	Add a new bus method to fetch device-specific CPU sets. bus_get_cpus() returns a specified set of CPUs for a device. It accepts an enum for the second parameter that indicates the type of cpuset to request. Currently two valus are supported: - LOCAL_CPUS (on x86 this returns all the CPUs in the package closest to the device when DEVICE_NUMA is enabled) - INTR_CPUS (like LOCAL_CPUS but only returns 1 SMT thread for each core) For systems that do not support NUMA (or if it is not enabled in the kernel config), LOCAL_CPUS fails with EINVAL. INTR_CPUS is mapped to 'all_cpus' by default. The idea is that INTR_CPUS should always return a valid set. Device drivers which want to use per-CPU interrupts should start using INTR_CPUS instead of simply assigning interrupts to all available CPUs. In the future we may wish to add tunables to control the policy of INTR_CPUS (e.g. should it be local-only or global, should it ignore SMT threads or not). The x86 nexus driver exposes the internal set of interrupt CPUs from the the x86 interrupt code via INTR_CPUS. The ACPI bus driver and PCI bridge drivers use _PXM to return a suitable LOCAL_CPUS set when _PXM exists and DEVICE_NUMA is enabled. They also and the global INTR_CPUS set from the nexus driver with the per-domain set from _PXM to generate a local INTR_CPUS set for child devices. Reviewed by: wblock (manpage) Differential Revision: https://reviews.freebsd.org/D5519	2016-05-02 18:00:38 +00:00
John Baldwin	8d07a66d77	Only count CPU devices that are using the ACPI CPU driver. Arguably we should only be doing the probe/attach to children of these devices as well. Tested by: Michal Stanek <mst_semihalf.com> (arm64) Differential Revision: https://reviews.freebsd.org/D6133	2016-04-28 18:53:14 +00:00
John Baldwin	1b424b5655	Adjust prototypes for NUMA-related functions to match the style of the rest of this file.	2016-04-27 21:12:05 +00:00
Jung-uk Kim	f8146b882b	Merge ACPICA 20160422.	2016-04-27 19:09:21 +00:00
John Baldwin	67e7d085ae	Add a pcib_attach_child() method to manage adding the child "pci" device. This allows the PCI-PCI bridge driver to save a reference to the child device in its softc. Note that this required moving the "pci" device creation out of acpi_pcib_attach(). Instead, acpi_pcib_attach() is renamed to acpi_pcib_fetch_prt() as it's sole action now is to fetch the PCI interrupt routing table. Differential Revision: https://reviews.freebsd.org/D6021	2016-04-27 16:39:05 +00:00
John Baldwin	4c26ac696c	Optionally return the output capabilities list from _OSC. Both of the callers were expecting the input cap_set to be modified. This fixes them to request cap_set to be updated with the returned buffer. Reviewed by: jkim Differential Revision: https://reviews.freebsd.org/D6040	2016-04-22 17:51:19 +00:00
John Baldwin	f8887b894c	Queue the CPU-probing task after all acpi_cpu devices are attached. Eventually with earlier AP startup this code will change to call the startup function synchronously instead of queueing the task. Moving the time we queue the task should be a no-op since taskqueue threads don't start executing tasks until much later, but this reduces the diff with the earlier AP startup patches. Sponsored by: Netflix	2016-04-21 18:27:05 +00:00
Jung-uk Kim	e5f7701025	Prefer sizeof(*pointer) over sizeof(type). No funtional change.	2016-04-20 21:30:56 +00:00
Jung-uk Kim	87f0a4bf1f	There is no need to use array any more. No functional change.	2016-04-20 21:26:59 +00:00
Jung-uk Kim	cad6d22280	Remove query flag from acpi_EvaluateOSC(). This function does not support return buffer (yet).	2016-04-20 21:21:47 +00:00
John Baldwin	508c21b669	Invoke _OSC on Host-PCI bridges. Tell the firmware that we support PCI-express config space access and MSI. Reviewed by: jkim MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6023	2016-04-20 20:58:30 +00:00
John Baldwin	5f3dd91a8f	Add a wrapper for evaluating _OSC methods. This wrapper does not translate errors in the first word to ACPI error status returns. Use this wrapper in the acpi_cpu(4) driver in place of the existing _OSC code. While here, fix a bug where the wrong count of words was passed when invoking _OSC. Reviewed by: jkim MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D6022	2016-04-20 20:55:58 +00:00
John Baldwin	6cd99ae86d	Add a new PCI bus interface method to alloc the ivars (dinfo) for a device. The ACPI and OFW PCI bus drivers as well as CardBus override this to allocate the larger ivars to hold additional info beyond the stock PCI ivars. This removes the need to pass the size to functions like pci_add_iov_child() and pci_read_device() simplifying IOV and bus rescanning implementations. As a result of this and earlier changes, the ACPI PCI bus driver no longer needs its own device_attach and pci_create_iov_child methods but can use the methods in the stock PCI bus driver instead. Differential Revision: https://reviews.freebsd.org/D5891	2016-04-15 03:42:12 +00:00
John Baldwin	62d70a8174	Add more fine-grained kernel options for NUMA support. VM_NUMA_ALLOC is used to enable use of domain-aware memory allocation in the virtual memory system. DEVICE_NUMA is used to enable affinity reporting for devices such as bus_get_domain(). MAXMEMDOM must still be set to a value greater than for any NUMA support to be effective. Note that 'cpuset -gd' always works if MAXMEMDOM is enabled and the system supports NUMA. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D5782	2016-04-09 13:58:04 +00:00
John Baldwin	652523175b	Associate device_t objects with ACPI handles via PCI_CHILD_ADDED(). Previously, the ACPI PCI bus driver did a single pass over the devices in the namespace that were a child of a given PCI bus to associate the PCI bus-enumerated device_t devices with the corresponding ACPI handles. However, this meant that handles were only established at runtime for devices found during the initial PCI bus scan. PCI_IOV adds devices that show up after the initial PCI bus scan, and coming changes to add a bus rescan can also add devices after the initial scan. This change adds a pci_child_added() callback to the ACPI PCI bus that walks the namespace to find the ACPI handle for each device that is added. Using a callback means that the handle is correctly set for any device no matter how it is added (initial scan, IOV, or a bus rescan).	2016-04-07 17:15:16 +00:00
John Baldwin	496dfa89a6	Convert pci_delete_child() to a bus_child_deleted() method. Instead of providing a wrapper around device_delete_child() that the PCI bus and child bus drivers must call explicitly, move the bulk of the logic from pci_delete_child() into a bus_child_deleted() method (pci_child_deleted()). This allows PCI devices to be safely deleted via device_delete_child(). - Add a bus_child_deleted method to the ACPI PCI bus which clears the device_t associated with the corresponding ACPI handle in addition to the normal PCI bus cleanup. - Change cardbus_detach_card to call device_delete_children() and move CardBus-specific delete logic into a new cardbus_child_deleted() method. - Use device_delete_child() instead of pci_delete_child() in the SRIOV code. - Add a bus_child_deleted method to the OpenFirmware PCI bus drivers which frees the OpenFirmware device info for each PCI device. Reviewed by: imp Tested on: amd64 (CardBus and PCI-e hotplug) Differential Revision: https://reviews.freebsd.org/D5831	2016-04-06 04:10:22 +00:00
Justin Hibbits	da1b038af9	Use uintmax_t (typedef'd to rman_res_t type) for rman ranges. On some architectures, u_long isn't large enough for resource definitions. Particularly, powerpc and arm allow 36-bit (or larger) physical addresses, but type `long' is only 32-bit. This extends rman's resources to uintmax_t. With this change, any resource can feasibly be placed anywhere in physical memory (within the constraints of the driver). Why uintmax_t and not something machine dependent, or uint64_t? Though it's possible for uintmax_t to grow, it's highly unlikely it will become 128-bit on 32-bit architectures. 64-bit architectures should have plenty of RAM to absorb the increase on resource sizes if and when this occurs, and the number of resources on memory-constrained systems should be sufficiently small as to not pose a drastic overhead. That being said, uintmax_t was chosen for source clarity. If it's specified as uint64_t, all printf()-like calls would either need casts to uintmax_t, or be littered with PRI64 macros. Casts to uintmax_t aren't horrible, but it would also bake into the API for resource_list_print_type() either a hidden assumption that entries get cast to uintmax_t for printing, or these calls would need the PRI64 macros. Since source code is meant to be read more often than written, I chose the clearest path of simply using uintmax_t. Tested on a PowerPC p5020-based board, which places all device resources in 0xfxxxxxxxx, and has 8GB RAM. Regression tested on qemu-system-i386 Regression tested on qemu-system-mips (malta profile) Tested PAE and devinfo on virtualbox (live CD) Special thanks to bz for his testing on ARM. Reviewed By: bz, jhb (previous) Relnotes: Yes Sponsored by: Alex Perez/Inertial Computing Differential Revision: https://reviews.freebsd.org/D4544	2016-03-18 01:28:41 +00:00
Justin Hibbits	e8cf8358d0	Remove default initializations for rman, a'la r296331	2016-03-04 01:25:45 +00:00
Jung-uk Kim	2f977ab449	Silence PVS-Studio warning (V595).	2016-02-23 23:09:45 +00:00
Jung-uk Kim	8444599e1e	Silence PVS-Studio warning (V595).	2016-02-23 22:55:44 +00:00
Jung-uk Kim	7054df7f5b	Remove brightness notify handler before reinstalling new one.	2016-02-23 22:50:45 +00:00
Jung-uk Kim	3e4f6cc6c7	Fix white spaces.	2016-02-23 22:30:45 +00:00
Jung-uk Kim	8d4a1dbf7b	Fix style(9) bugs.	2016-02-23 22:22:15 +00:00
Konstantin Belousov	2fe1339ea2	Some BIOSes ACPI bytecode needs to take (sleepable) acpi mutex for acpi_GetInteger() execution. Intel DMAR interrupt remapping code needs to know UID of the HPET to properly route the FSB interrupts from the HPET, even when interrupt remapping is disabled, and the code is executed under some non-sleepable mutexes. Cache HPET UIDs in the device softc at the attach time and provide lock-less method to get UID, use the method from the dmar hpet handling code instead of calling GetInteger(). Reported and tested by: Larry Rosenman <ler@lerctr.org> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-02-20 13:37:04 +00:00
Konstantin Belousov	94a4ee3be7	Switch /dev/hpet to use make_dev_s(9). Device needs si_drv1 initializated, do it correctly even though hpet cannot be loaded as module. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-02-20 13:21:59 +00:00
Justin Hibbits	7915adb560	Introduce a RMAN_IS_DEFAULT_RANGE() macro, and use it. This simplifies checking for default resource range for bus_alloc_resource(), and improves readability. This is part of, and related to, the migration of rman_res_t from u_long to uintmax_t. Discussed with: jhb Suggested by: marcel	2016-02-20 01:32:58 +00:00
Adrian Chadd	ad73b30173	document some ACPI related sysctls. Submitted by: Oliver Pinter <oliver.pinter@hardenedbsd.org> Sponsored by: HardenedBSD Differential Revision: https://reviews.freebsd.org/D5263	2016-02-19 05:02:17 +00:00
Jung-uk Kim	bfe2514a08	Remove a bogus bzero() call. Found by: PVS-Studio	2016-02-18 23:32:11 +00:00
Justin Hibbits	2dd1bdf183	Convert rman to use rman_res_t instead of u_long Summary: Migrate to using the semi-opaque type rman_res_t to specify rman resources. For now, this is still compatible with u_long. This is step one in migrating rman to use uintmax_t for resources instead of u_long. Going forward, this could feasibly be used to specify architecture-specific definitions of resource ranges, rather than baking a specific integer type into the API. This change has been broken out to facilitate MFC'ing drivers back to 10 without breaking ABI. Reviewed By: jhb Sponsored by: Alex Perez/Inertial Computing Differential Revision: https://reviews.freebsd.org/D5075	2016-01-27 02:23:54 +00:00
Colin Percival	2eb0015ab7	Disable suspend when we're shutting down. This solves the "tell FreeBSD to shut down; close laptop lid" scenario which otherwise tended to end with a laptop overheating or the battery dying. The implementation uses a new sysctl, kern.suspend_blocked; init(8) sets this while rc.suspend runs, and the ACPI sleep code ignores requests while the sysctl is set. Discussed on: freebsd-acpi (35 emails) MFC after: 1 week	2015-10-01 10:52:26 +00:00
Zbigniew Bodek	18c72666ce	Add domain support to PCI bus allocation When the system has more than a single PCI domain, the bus numbers are not unique, thus they cannot be used for "pci" device numbering. Change bus numbers to -1 (i.e. to-be-determined automatically) wherever the code did not care about domains. Reviewed by: jhb Obtained from: Semihalf Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D3406	2015-09-16 23:34:51 +00:00
Jung-uk Kim	70e6ab8f6b	Merge ACPICA 20150818.	2015-08-26 17:13:47 +00:00
Jung-uk Kim	0594dadeb8	Catch up with ACPICA 20150717.	2015-07-22 16:26:17 +00:00
Andrew Turner	617994efc7	Add basic support for ACPI. It splits out the nexus driver to two new drivers, one for fdt, one for acpi. It then uses this to decide if it will use fdt or acpi. The GICv2 (interrupt controller) and Generic Timer drivers have been updated to handle both cases. As this is early code we still need FDT to find the kernel console, and some parts are still missing, including PCI support. Differential Revision: https://reviews.freebsd.org/D2463 Reviewed by: jhb, jkim, emaste Obtained from: ABT Systems Ltd Relnotes: Yes Sponsored by: The FreeBSD Foundation	2015-06-11 15:45:33 +00:00
Jung-uk Kim	9ae8e0064a	Check status of AcpiReadBitRegister() calls. Reported by: Coverity CID: 1306132	2015-06-09 23:13:37 +00:00
Jung-uk Kim	fd90e2ed54	CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten years for head. However, it is continuously misused as the mpsafe argument for callout_init(9). Deprecate the flag and clean up callout_init() calls to make them more consistent. Differential Revision: https://reviews.freebsd.org/D2613 Reviewed by: jhb MFC after: 2 weeks	2015-05-22 17:05:21 +00:00
Jung-uk Kim	9cf4cabed7	Do not probe Intel PIIX4 south bridge quirks on amd64. These quirky south bridges only supported Intel Pentium and Pentium II era processors and there is no reason for hardware virtualizations to emulate these quirks. MFC after: 1 week	2015-05-21 19:31:10 +00:00
Andrew Turner	044a49cd24	Hide code only used on i386 and amd64.	2015-05-11 14:36:34 +00:00
Konstantin Belousov	b57a73f8e7	If x86 CPU implementation of the MWAIT instruction reasonably interacts with interrupts, query ACPI and use MWAIT for entrance into Cx sleep states. Support C1 "I/O then halt" mode. See Intel' document 302223-007 "Intelб╝ Processor Vendor-Specific ACPI Interface Specification" for description. Move the acpi_cpu_c1() function into x86/cpu_machdep.c and use it instead of inlining "sti; hlt" sequence in several places. In the acpi(4) man page, besides documenting the dev.cpu.N.cx_methods sysctl, correct the names for dev.cpu.N.{cx_usage,cx_lowest,cx_supported} sysctls. Both jkim and avg have some other patches implementing the mwait functionality; this work is unrelated. Linux does not rely on the ACPI to provide correct tables describing Cx modes. Instead, the driver has pre-defined knowledge of the CPU models, it was supplied by Intel. Tested by: pho (previous versions) Sponsored by: The FreeBSD Foundation	2015-05-09 12:28:48 +00:00
Andrew Turner	fe5d5d7d05	AcpiGbl_FACS will not be defined when building using the reduced hardware model. This may be the case on ARM.	2015-05-06 14:14:14 +00:00
Andrew Turner	ef74544931	If the power management timer is unsupported the PmTimerLength value will be zero.	2015-05-06 14:09:54 +00:00
Andrew Turner	d8c37c3a16	There may not be an FACS table, check for this before accessing it. Sponsored by: The FreeBSD Foundation	2015-04-28 16:06:58 +00:00
Adrian Chadd	5d18c60a93	Refactor out the _PXM -> VM domain lookup done in ACPI, in preparation for its use in upcoming code. This is inspired by something in jhb's NUMA IRQ allocation patchset. However, the tricky bit here is that the PXM lookup for a node may fail, requiring a lookup on the parent node. So if it doesn't exist, don't fail - just go up to the parent. Only error out of the lookup is the ACPI lookup returns an error. Sponsored by: Norse Corp, Inc.	2015-04-19 17:15:55 +00:00
Konstantin Belousov	4f1c158445	Define capabilities bits from the revision 007 of the document 302223 "Intelб╝ Processor Vendor-Specific ACPI Interface Specification", issied Dec 2014. Previous revision 005 was from Sep 2006. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-04-12 10:28:15 +00:00
Jung-uk Kim	7cf3e94a41	Merge ACPICA 20150410.	2015-04-11 03:23:41 +00:00
John Baldwin	44947d3aeb	Move the message complaining about failed system resource allocations under bootverbose. Every example I've seen to date has been due to an ACPI system resource device reserving a range that overlaps with system memory (which ram0 attempts to reserve) or a local or I/O APIC (which apic0 attempts to reserve). These are always harmless but look scary to users. MFC after: 1 week	2015-04-06 17:39:36 +00:00
John Baldwin	15ae88baaa	Fix a typo.	2015-03-06 20:53:56 +00:00
Ryan Stone	9bfb1e36d9	Implement interface to create SR-IOV Virtual Functions Implement the interace to create SR-IOV Virtual Functions (VFs). When a driver registers that they support SR-IOV by calling pci_setup_iov(), the SR-IOV code creates a new node in /dev/iov for that device. An ioctl can be invoked on that device to create VFs and have the driver initialize them. At this point, allocating memory I/O windows (BARs) is not supported. Differential Revision: https://reviews.freebsd.org/D76 Reviewed by: jhb MFC after: 1 month Sponsored by: Sandvine Inc.	2015-03-01 00:40:09 +00:00
Konstantin Belousov	d19b0f3ea5	Array cannot be NULL, remove always true comparision. ACPI spec identifies the tested condition for _PRT as "BYTE value of 0", so the remaining part of the conditionals is sufficient. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-02-16 22:18:43 +00:00
John Baldwin	64de80195b	Add a new device control utility for new-bus devices called devctl. This allows the user to request administrative changes to individual devices such as attach or detaching drivers or disabling and re-enabling devices. - Add a new /dev/devctl2 character device which uses ioctls for device requests. The ioctls use a common 'struct devreq' which is somewhat similar to 'struct ifreq'. - The ioctls identify the device to operate on via a string. This string can either by the device's name, or it can be a bus-specific address. (For unattached devices, a bus address is the only way to locate a device.) Bus drivers register an eventhandler to claim unrecognized device names that the driver recognizes as a valid address. Two buses currently support addresses: ACPI recognizes any device in the ACPI namespace via its full path starting with "\" and the PCI bus driver recognizes an address specification of 'pci[<domain>:]<bus>:<slot>:<func>' (identical to the PCI selector strings supported by pciconf). - To make it easier to cut and paste, change the PnP location string in the PCI bus driver to output a full PCI selector string rather than 'slot=<slot> function=<func>'. - Add a devctl(3) interface in libdevctl which provides a wrapper around the ioctls and is the preferred interface for other userland code. - Add a devctl(8) program which is a simple wrapper around the requests supported by devctl(3). - Add a device_is_suspended() function to check DF_SUSPENDED. - Add a resource_unset_value() function that can be used to remove a hint from the kernel environment. This is used to clear a hint.<driver>.<unit>.disabled hint when re-enabling a boot-time disabled device. Reviewed by: imp (parts) Requested by: imp (changing PCI location string) Relnotes: yes	2015-02-06 16:09:01 +00:00
Andriy Gapon	85f95fffd7	hook userland threads suspend + resume into acpi suspend code Also, split power_suspend into power_suspend and power_suspend_early. power_suspend_early is called before the userland is frozen. power_suspend is called after the userland is frozen. Currently only VT switching is hooked to power_suspend_early. This is needed because switching away from X server requires its cooperation, so obviously X server must not be frozen when that happens. Freezing userland during ACPI suspend is useful because not all drivers correctly handle suspension concurrent with other activity. This is especially applicable to drivers ported from other operating systems that suspend all software activity between placing drivers and hardware into suspended state. In particular drm2/radeon (radeonkms) depends on the described procedure. The driver does not have any internal synchronization between suspension activities and processing of userland requests. Many thanks to kib for the code that allows to freeze and thaw all userland threads. Note that ideally we also need to park / inhibit (non-special) kernel threads as well to ensure that they do not call into drivers. MFC after: 17 days	2015-01-27 17:33:18 +00:00
Jung-uk Kim	ae1ef64fe4	Simplify retry loops. No functional change.	2015-01-23 18:55:04 +00:00
Jung-uk Kim	4552fccb4b	Revert r216942. This commit was premature and caused too many complaints. PR: 162859 MFC after: 3 days	2015-01-23 18:12:44 +00:00
Colin Percival	633a28478c	When disabling C3+ CPU states due to the CPU_QUIRK_NO_C3 quirk, don't accidentally enable non-existent states. This bug was triggered if ACPI advertises the presence of a C2 state which we fail to parse via acpi_PkgGas due to our lack of support for FFixedHW resources, and causes an immediate panic when an attempt is made to enter the (NULL) state. One affected platform is the EC2 c4.8xlarge VM instance type; there may be others. MFC after: 1 week Thanks to: jkim, @_msw_	2015-01-18 12:45:26 +00:00
John Baldwin	92597e064b	On some Intel CPUs with a P-state but not C-state invariant TSC the TSC may also halt in C2 and not just C3 (it seems that in some cases the BIOS advertises its C3 state as a C2 state in _CST). Just play it safe and disable both C2 and C3 states if a user forces the use of the TSC as the timecounter on such CPUs. PR: 192316 Differential Revision: https://reviews.freebsd.org/D1441 No objection from: jkim MFC after: 1 week	2015-01-05 20:44:44 +00:00
Jung-uk Kim	44aba0f6c2	Use the correct device. Note this commit complements r274386. PR: 194884	2014-11-11 19:42:10 +00:00
Adrian Chadd	2da2ade021	Use the correct device (child) when asking the bus layer about which power state said device should go into. This was a snafu introduced in the ACPI/PCI awareness separation. When putting a device into a power state, the bus (and thus firmware, eg ACPI) should be asked before hand to check whether the device can indeed go into that power state. There's a set of nodes in ACPI under each device - the _SxD nodes - which state which ACPI power state to put the device into when the system is going into power save state 'x'. So when going into S3, the existence of an _S3D node would override whatever the system was trying to do. By default the PCI code wants to put devices into D3 before suspending. I have a laptop here (Asus Zenbook - check the PR) whose EHCI controller really wants to be in D2 during suspend, not D3. So if we put it into D3 and then try to enter S3, everything hangs. The device itself can go into D3 - it just can't be there when the call to ACPI to enter S3 occurs. The PCI patch fixes this. jkim@ noticed that the same is needed for the ACPI child device enumeration. Thankyou to Matt Dillon (the programmer, not the actor) for buying me this particular laptop so I could debug the issues with the Atheros AR9485 that is in it. It's his fault that I ended up with this laptop and was sufficiently annoyed by the lack of USB suspend to go down this rabbit hole. Tested: * Thinkpad T400 * Thinkpad X230 * Thinkpad T42 * Thinkpad T60 * Asus Zenbook (see PR) * Asus EEEPC 701 * Asus EEEPC 1001PX TODO: * Figure out what we should do about devices we unload drivers for that want to be in a specific state when entering S3 / S4 - the "put devices into D3 if they're not bound to a driver" option may also mess with things. PR: kern/194884 Reviewed by: jhb, jkim MFC after: 1 week Relnotes: yes Sponsored by: Matt Dillon <dillon@apollo.backplane.com> (hardware)	2014-11-11 17:14:11 +00:00
Hans Petter Selasky	0e1152fcc2	The SYSCTL data pointers can come from userspace and must not be directly accessed. Although this will work on some platforms, it can throw an exception if the pointer is invalid and then panic the kernel. Add a missing SYSCTL_IN() of "SCTP_BASE_STATS" structure. MFC after: 3 days Sponsored by: Mellanox Technologies	2014-10-28 12:00:39 +00:00
Konstantin Belousov	3f6732e8e0	Set the caching mode for the usermode mapping of the HPET registers page to uncached. Reviewed by: rpaulo Sponsored by: The FreeBSD Foundation MFC after: 1 week	2014-10-25 21:01:50 +00:00
Rui Paulo	f53389946d	Add a sysctl to control the HPET allow_write behaviour. Requested by: kib	2014-10-24 21:08:36 +00:00
Rui Paulo	86b19a6984	HPET: avoid handling the multiple file-descriptor case. It had two bugs: one where mmap was still allowed and another where D_TRACKCLOSE doesn't handle all cases. Thanks to jhb and kib for pointing them out. MFC after: 1 week	2014-10-24 19:58:00 +00:00
Rui Paulo	3149cc9df6	HPET: create /dev/hpetN as a way to access HPET from userland. In some cases, TSC is broken and special applications might benefit from memory mapping HPET and reading the registers to count time. Most often the main HPET counter is 32-bit only[1], so this only gives the application a 300 second window based on the default HPET interval. Other applications, such as Intel's DPDK, expect /dev/hpet to be present and use it to count time as well. Although we have an almost userland version of gettimeofday() which uses rdtsc in userland, it's not always possible to use it, depending on how broken the multi-socket hardware is. Install the acpi_hpet.h so that applications can use the HPET register definitions. [1] I haven't found a system where HPET's main counter uses more than 32 bit. There seems to be a discrepancy in the Intel documentation (claiming it's a 64-bit counter) and the actual implementation (a 32-bit counter in a 64-bit memory area). MFC after: 1 week Relnotes: yes	2014-10-24 18:39:15 +00:00
Davide Italiano	2be111bf7d	Follow up to r225617. In order to maximize the re-usability of kernel code in userland rename in-kernel getenv()/setenv() to kern_setenv()/kern_getenv(). This fixes a namespace collision with libc symbols. Submitted by: kmacy Tested by: make universe	2014-10-16 18:04:43 +00:00
Adrian Chadd	ffcf962dab	Add a bus method to fetch the VM domain for the given device/bus. * Add a bus_if.m method - get_domain() - returning the VM domain or ENOENT if the device isn't in a VM domain; * Add bus methods to print out the domain of the device if appropriate; * Add code in srat.c to save the PXM -> VM domain mapping that's done and expose a function to translate VM domain -> PXM; * Add ACPI and ACPI PCI methods to check if the bus has a _PXM attribute and if so map it to the VM domain; * (.. yes, this works recursively.) * Have the pci bus glue print out the device VM domain if present. Note: this is just the plumbing to start enumerating information - it doesn't at all modify behaviour. Differential Revision: D906 Reviewed by: jhb Sponsored by: Norse Corp	2014-10-09 05:33:25 +00:00
Jung-uk Kim	313a0c13ef	Merge ACPICA 20140926.	2014-10-02 19:11:18 +00:00
Will Andrews	bb6b32dd81	Add sysctl to track the resource consumption of ACPI interrupts. Submitted by: gibbs MFC after: 1 month Sponsored by: Spectra Logic MFSpectraBSD: 636827 on 2012/09/28	2014-10-01 14:35:52 +00:00
Roger Pau Monné	44e06d158a	msi: add Xen MSI implementation This patch adds support for MSI interrupts when running on Xen. Apart from adding the Xen related code needed in order to register MSI interrupts this patch also makes the msi_init function a hook in init_ops, so different MSI implementations can have different initialization functions. Sponsored by: Citrix Systems R&D xen/interface/physdev.h: - Add the MAP_PIRQ_TYPE_MULTI_MSI to map multi-vector MSI to the Xen public interface. x86/include/init.h: - Add a hook for setting custom msi_init methods. amd64/amd64/machdep.c: i386/i386/machdep.c: - Set the default msi_init hook to point to the native MSI initialization method. x86/xen/pv.c: - Set the Xen MSI init hook when running as a Xen guest. x86/x86/local_apic.c: - Call the msi_init hook instead of directly calling msi_init. xen/xen_intr.h: x86/xen/xen_intr.c: - Introduce support for registering/releasing MSI interrupts with Xen. - The MSI interrupts will use the same PIC as the IO APIC interrupts. xen/xen_msi.h: x86/xen/xen_msi.c: - Introduce a Xen MSI implementation. x86/xen/xen_nexus.c: - Overwrite the default MSI hooks in the Xen Nexus to use the Xen MSI implementation. x86/xen/xen_pci.c: - Introduce a Xen specific PCI bus that inherits from the ACPI PCI bus and overwrites the native MSI methods. - This is needed because when running under Xen the MSI messages used to configure MSI interrupts on PCI devices are written by Xen itself. dev/acpica/acpi_pci.c: - Lower the quality of the ACPI PCI bus so the newly introduced Xen PCI bus can take over when needed. conf/files.i386: conf/files.amd64: - Add the newly created files to the build process.	2014-09-30 16:46:45 +00:00
John Baldwin	fadf3fb98c	Convert from timeout(9) to callout(9).	2014-09-22 14:27:26 +00:00
Adrian Chadd	9cf5a6aad8	Populate the device info string with _PXM (proximity domain) information. This is primarily useful for debugging right now - it'll show up in devinfo. Reviewed by: jhb	2014-09-20 04:31:12 +00:00
John Baldwin	1f895058e4	Revert unrelated changes accidentally committed in r271192.	2014-09-17 18:55:39 +00:00
John Baldwin	b1d735ba4c	Create a separate structure for per-CPU state saved across suspend and resume that is a superset of a pcb. Move the FPU state out of the pcb and into this new structure. As part of this, move the FPU resume code on amd64 into a C function. This allows resumectx() to still operate only on a pcb and more closely mirrors the i386 code. Reviewed by: kib (earlier version)	2014-09-06 15:23:28 +00:00
Neel Natu	3c6f0322bb	Fix typo when displaying the HPET timer unit number.	2014-08-13 00:18:16 +00:00
Roger Pau Monné	c2641d23e1	xen: add ACPI bus to xen_nexus when running as Dom0 Also disable a couple of ACPI devices that are not usable under Dom0. To this end a couple of booleans are added that allow disabling ACPI specific devices. Sponsored by: Citrix Systems R&D Reviewed by: jhb x86/xen/xen_nexus.c: - Return BUS_PROBE_SPECIFIC in the Xen Nexus attachement routine to force the usage of the Xen Nexus. - Attach the ACPI bus when running as Dom0. dev/acpica/acpi_cpu.c: dev/acpica/acpi_hpet.c: dev/acpica/acpi_timer.c - Add a variable that gates the addition of the devices. x86/include/init.h: - Declare variables that control the attachment of ACPI cpu, hpet and timer devices.	2014-08-04 09:05:28 +00:00
Marcel Moolenaar	e7d939bda2	Remove ia64. This includes: o All directories named ia64 o All files named ia64 o All ia64-specific code guarded by __ia64__ o All ia64-specific makefile logic o Mention of ia64 in comments and documentation This excludes: o Everything under contrib/ o Everything under crypto/ o sys/xen/interface o sys/sys/elf_common.h Discussed at: BSDcan	2014-07-07 00:27:09 +00:00
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Hans Petter Selasky	1ccbb263b5	Remove not needed initialisation code.	2014-06-26 10:48:01 +00:00
John Baldwin	d2dc06ca16	Expand r261243 even further and ignore any I/O port resources assigned to PCI root bridges except for the one known-valid case on x86 where bridges claim the I/O port registers used for PCI config space access. Tested by: Hilko Meyer <hilko.meyer@gmx.de> MFC after: 1 week	2014-06-25 20:30:47 +00:00
John Baldwin	2f951d2989	Trust the state of a power resource that get from a working _STA method instead of trying to cache it. Previously, we only trusted the state if we did not have a cached state. However, once a state was cached, the _STA method was always ignored. Specifically, once a power resource had been turned on once (e.g. during resume), the driver assumed it was always on even if _STA said it was off and never turned it back on. This prevented the power resource from being turned back on if a laptop was resumed twice, for example. To fix, just remove the cached state entirely and always use the results of _STA. The loops already skip any resources where _STA fails. Submitted by: trasz (initial patch to invoke _ON) MFC after: 1 week	2014-06-19 18:35:14 +00:00
Steven Hartland	1ddd62d0a9	Remove duplicate SYSCTL_DECL(_debug_acpi) which was breaking tinderbox MFC after: 2 weeks X-MFC-With: r264849	2014-04-24 14:58:12 +00:00
Steven Hartland	802d215dc1	Increase ACPI_MAX_TASKS to be 4 x the number of CPU's as 2 x was still insufficient on some machines MFC after: 2 weeks	2014-04-24 12:38:07 +00:00
Steven Hartland	c9c13b5b77	Exposed debug.acpi.max_tasks and debug.acpi.max_threads via sysctls so their values can be viewed.	2014-04-24 00:41:02 +00:00
Adrian Chadd	0d4700544f	Add a basic set of data points which count the number of sleep entries that are being done by the OS. For now this'll match up with the "wakeups"; although I'll dig deeper into this to see if we can determine which sleep state the CPU managed to get into. Most things I've seen these days only expose up to C2 or C3 via ACPI even though the CPU goes all the way down to C6 or C7.	2014-04-08 02:36:27 +00:00

1 2 3 4 5 ...

1273 Commits