Commit Graph

49 Commits

Author SHA1 Message Date
Yuri Pankov
41062d6878 hwpstate_intel: don't unconditionally print the error message
Actually check the wrmsr_safe() return value when setting autonomous
HWP for package.

PR:		245582
Differential Revision:	https://reviews.freebsd.org/D24744
2020-11-29 01:43:04 +00:00
Mateusz Guzik
ab6c81a218 x86: clean up empty lines in .c and .h files 2020-09-01 21:23:59 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Conrad Meyer
e656fa70dc hwpstate_intel(4): Save admin-set EPP/EPB and restore after suspend 2020-02-01 20:12:02 +00:00
Conrad Meyer
c93eb470b4 hwpstate_intel(4): Print failure message only on failure
X-MFC-With: r357379
2020-02-01 20:11:25 +00:00
Conrad Meyer
cd4e43b27d hwpstate_intel(4): Detect and support PKG variant
If package-level control is present, we default to using it.  Per-core
software control may be enabled by setting the machdep.hwpstate_pkg_ctrl
tunable to "0" in loader.conf(5).
2020-02-01 19:50:10 +00:00
Conrad Meyer
556a1a0bc6 hwpstate_intel(4): Add fallback EPP using PERF_BIAS MSR
Per Intel SDM (Vol 3b Part 2), if HWP indicates EPP (energy-performance
preference) is not supported, the hardware instead uses the ENERGY_PERF_BIAS
MSR.  In the epp sysctl handler, fall back to that MSR if HWP does not
support EPP and CPUID indicates the ENERGY_PERF_BIAS MSR is supported.
2020-02-01 19:49:13 +00:00
Conrad Meyer
5e3574c8cd x86: Add/amend some power-management comments/macros
No functional change.
2020-02-01 19:46:02 +00:00
Conrad Meyer
b80d476c3c hwpstate_intel(4): Error check epp sysctl & bail if HW does not support feature 2020-02-01 19:45:27 +00:00
Conrad Meyer
f591c3c847 intel_hwpstate(4): Use identcpu-cached cpuid 6 leaf
No functional change.
2020-02-01 17:54:46 +00:00
Conrad Meyer
351896d372 intel_hwpstate(4): Don't leak bound thread in error conditions
I don't know why a Skylake CPU with the HWP feature bit present would trap
on MSR reads of the HWP registers, but if this occurs, do not leave the
attach thread bound.  This could conceivably cause reported hangs, although
I have no evidence that this is the cause.

Reported by:	ae@, Andreas Nilsson <andrnils AT gmail.com>
X-MFC-With:	r357002
2020-02-01 17:30:45 +00:00
Conrad Meyer
43524989c5 hwpstate(4): Ignore CurPstateLimit by default
Add a sysctl knob to allow users to re-enable it, and document the knob and
default in cpufreq.4.  (While here, add a few unrelated updates to
cpufreq.4.)

It seems that the register value in some hardware simply reflects the
configured P-state.  This results in an inadvertent and unintended outcome
where the P-state can only walk down, and then the driver becomes "stuck" in
the slowest possible P-state.

The Linux driver never consults this register, so that's some evidence that
ignoring the contents are relatively harmless.

PR:		234733
Reported by:	sigsys AT gmail.com, Erich Dollanksy <freebsd.ed.lists AT
		sumeritec.com>
2020-01-31 17:40:41 +00:00
Conrad Meyer
07a65f9d38 hwpstate_intel(4): Silence/fix Coverity reports
These were all introduced in the initial import of hwpstate_intel(4).

Reported by:	Coverity
CIDs:		1413161, 1413164, 1413165, 1413167
X-MFC-With:	r357002
2020-01-29 03:15:34 +00:00
Conrad Meyer
9ea85092d9 hwpstate(4): Log a debug line when throttled
If we're going to throttle user requested P-states, we should at least produce
a debug log line indicating the surprising behavior.

PR:		inspired by 234733
2020-01-27 06:04:32 +00:00
Conrad Meyer
08a220dd79 cpufreq(4): Fix missing MODULE_DEPEND on hwpstate_intel
DRIVER_MODULE does not actually define a MODULE_VERSION, which is required
to satisfy a MODULE_DEPENDency.  Declare one explicitly in
hwpstate_intel(4).

Reported by:	flo
X-MFC-With:	r357002
2020-01-23 23:52:57 +00:00
Cy Schubert
ca9fb12a0b Fix 32-bit build post r357002. 2020-01-23 03:38:41 +00:00
Conrad Meyer
4577cf3744 cpufreq(4): Add support for Intel Speed Shift
Intel Speed Shift is Intel's technology to control frequency in hardware,
with hints from software.

Let's get a working version of this in the tree and we can refine it from
here.

Submitted by:	bwidawsk, scottph
Reviewed by:	bcr (manpages), myself
Discussed with:	jhb, kib (earlier versions)
With feedback from:	Greg V, gallatin, freebsdnewbie AT freenet.de
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D18028
2020-01-22 23:28:42 +00:00
Konstantin Belousov
2ee49fac82 Add support for Hygon Dhyana Family 18h processor.
As a new x86 CPU vendor, Chengdu Haiguang IC Design Co., Ltd (Hygon)
is a joint venture between AMD and Haiguang Information Technology Co.,
Ltd., aims at providing x86 processors for China server market.

The first generation Hygon processor(Dhyana) shares most architecture
with AMD's family 17h, but with different CPU vendor ID("HygonGenuine")
and PCI vendor ID(0x1d94) and family series number 18h(Hygon negotiated
with AMD to confirm that only Hygon use family 18h).

To enable Hygon Dhyana support in FreeBSD, add new definitions
HYGON_VENDOR_ID("HygonGenuine") and X86_VENDOR_HYGON(0x1d94) to identify
Hygon Dhyana CPU.

Initialize the CPU features(topology, local APIC ext, MSI, TSC, hwpstate,
MCA, DEBUG_CTL, etc) for amd64 and i386 mode by sharing the code path of
AMD family 17h.

The changes have been applied on FreeBSD 13.0-CURRENT and tested
successfully on Hygon Dhyana processor.

References:
[1] Linux kernel patches for Hygon Dhyana, merged in 4.20:

https://git.kernel.org/tip/c9661c1e80b609cd038db7c908e061f0535804ef

[2] MSR and CPUID definition:

https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf

Submitted by:	Pu Wen <puwen@hygon.cn>
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D23163
2020-01-21 13:22:35 +00:00
Conrad Meyer
e2e050c8ef Extract eventfilter declarations to sys/_eventfilter.h
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions.  The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended).  Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
2019-05-20 00:38:23 +00:00
Conrad Meyer
f6e61711ed cpufreq: Remove error-prone table terminators in favor of automatic sizing
PR:		227388
Reported by:	Vladimir Machulsky <xdelta AT meta.ua>
Sponsored by:	Dell EMC Isilon
2018-04-14 03:15:05 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Pedro F. Giffuni
74641f0bc6 x86: make some use of mallocarray(9).
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.

This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.

X-Differential revision: https://reviews.freebsd.org/D13837
2018-01-15 21:08:22 +00:00
Jung-uk Kim
82f0844956 Properly skip the first CPU. It only accidentally worked because the
CPU_FOREACH() loop always starts from BSP (cpu0) and the if condition
is always false for APs.

Reported by:	cem
2017-11-30 20:21:42 +00:00
Jung-uk Kim
e374a321fe Add a tunable "debug.hwpstate_verify" to check P-state after changing it and
turn it off by default.  It is very inefficient to verify current P-state of
each core, especially for CPUs with many cores.  When multiple commands are
requested to the same power domain before completion of pending transitions,
the last command is executed according to the manual.  Because requests are
serialized by the caller, all cores will receive the same command for each
call.  Do not call sched_bind() and sched_unbind().  It is redundant because
the caller does it anyway.
2017-11-30 01:40:07 +00:00
Jung-uk Kim
72b27e9773 Fix style(9). 2017-11-29 23:52:31 +00:00
Pedro F. Giffuni
ebf5747bdb sys/x86: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 15:11:47 +00:00
Conrad Meyer
2e81566368 cpufreq(4) hwpstate: Yield CPU awaiting frequency change
It doesn't seem necessary to busy the CPU while waiting to transition
into a different p-state.

PR:		221621 (related, but does not completely address)
Reviewed by:	truckman
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12260
2017-09-07 20:20:12 +00:00
Conrad Meyer
c768afe370 hwpstate: Add support for family 17h pstate info from MSRs
This information is normally available via acpi_perf, but in case it is not,
add support for fetching the information via MSRs on AMD family 17h (Zen)
processors.  Zen uses a slightly different formula than previous generation
AMD CPUs.

This was inspired by, but does not fix, PR 221621.

Reported by:	Sean P. R. <seanpr AT swbell.net>
Reviewed by:	mjoras@
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12082
2017-08-20 00:41:49 +00:00
Alexey Dokuchaev
a48f5e1ffa - Mention mismatching numbers in MSR vs. ACPI _PSS count warning: seeing
actual numbers would help debugging (also, `MSR' and `ACPI' are standard
  abbreviations and thus should be properly capitalized)
- Rephrase unsupported AMD CPUs message and wrap as an overly long line:
  `sorry' 1) is wrongly spelled after period (starts with a small letter)
  and 2) carries emotional "tinge" that is unnecessary and even bogus in
  debug message; `implemented' is not the best word as `supported' suits
  better in this context
- Improve readability when reporting resulted P-state transition (debug)

Approved by:	jhb
2016-12-01 14:31:05 +00:00
John Baldwin
7b64a80b55 Add powerd(8) support for several families of AMD CPUs.
Use the same logic to calculate the nominal CPU frequency from the P-state
MSRs on family 0x12, 0x15, and 0x16 CPUs as is used for family 0x10.
Family 0x14 was included in the original patch in the PR but I left that
out as the BIOS writer's guide for family 0x14 CPUs show a different layout
for the relevant MSR and include a different formulate for calculating the
frequency.

While here, simplify a few expressions and print out the family of
unsupported CPUs in hex rather than decimal.

PR:		212020
Submitted by:	Anthony Jenkins <Scoobi_doo@yahoo.com>
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D7587
2016-10-27 21:31:56 +00:00
Pedro F. Giffuni
a061aa46fe sys: replace comma with semicolon when pertinent.
Uses of commas instead of a semicolons can easily go undetected. The comma
can serve as a statement separator but this shouldn't be abused when
statements are meant to be standalone.

Detected with devel/coccinelle following a hint from DragonFlyBSD.

MFC after:	1 month
2016-08-09 19:42:20 +00:00
Pedro F. Giffuni
74b8d63dcc Cleanup unnecessary semicolons from the kernel.
Found with devel/coccinelle.
2016-04-10 23:07:00 +00:00
Hans Petter Selasky
af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
Glen Barber
37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky
3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Alexander Motin
9d75ca28f0 Do not DELAY() for P-state transition unless we want to see the result.
Intel manual says: "If a transition is already in progress, transition to
a new value will subsequently take effect. Reads of IA32_PERF_CTL determine
the last targeted operating point."  So seems it should be fine to just
trigger wanted transition and go.  Linux does the same.

MFC after:	1 month
2013-12-10 20:25:43 +00:00
Sean Bruno
ded71d6788 Fix powerd/states on AMD cpus. Resolves issues with system reporting:
hwpstate0: set freq failed, err 6

Tested on FX-8150 and others.

PR:		167018
Submitted by:	avg
MFC after:	2 weeks
2013-11-06 23:29:25 +00:00
Hiren Panchasara
46b29ff94a Adding a detach method to p4tcc driver.
PR:	118739
Submitted by:	Dan Lukes <dan@obluda.cz> (earlier version)
Reviewed by:	jhb
Approved by:	sbruno (mentor)
MFC after:	1 week
2013-05-10 22:43:27 +00:00
Eitan Adler
a8de37b024 This isn't functionally identical. In some cases a hint to disable
unit 0 would in fact disable all units.

This reverts r241856

Approved by: cperciva (implicit)
2012-10-22 13:06:09 +00:00
Eitan Adler
76b7512247 Now that device disabling is generic, remove extraneous code from the
device drivers that used to provide this feature.

Reviewed by:	des
Approved by:	cperciva
MFC after:	1 week
2012-10-22 03:41:14 +00:00
Alexander Kabaev
2f42a9bf0d Fix apparent logic reversal in setting the 'auto_mode' flag.
MFC after: 2 weeks
2012-02-26 21:24:27 +00:00
Jung-uk Kim
43d645f96b Use ACPI-supplied CPU frequencies instead of estimated ones as we are about
to use other values from the same table anyway.

MFC after:	3 days
2011-04-27 00:32:35 +00:00
Jung-uk Kim
3453537fa5 Use atomic load & store for TSC frequency. It may be overkill for amd64 but
safer for i386 because it can be easily over 4 GHz now.  More worse, it can
be easily changed by user with 'machdep.tsc_freq' tunable (directly) or
cpufreq(4) (indirectly).  Note it is intentionally not used in performance
critical paths to avoid performance regression (but we should, in theory).
Alternatively, we may add "virtual TSC" with lower frequency if maximum
frequency overflows 32 bits (and ignore possible incoherency as we do now).
2011-04-07 23:28:28 +00:00
Jung-uk Kim
49abdda9b7 Set C1 "I/O then Halt" capability bit for Intel EIST. Some broken BIOSes
refuse to load external SSDTs if this bit is unset for _PDC.  It seems Linux
and OpenSolaris did the same long ago.

MFC after:	1 week
2011-02-25 23:14:24 +00:00
Andriy Gapon
40934baa60 hwpstate: use CPU_FOREACH when binding to all available processors
Also, add a comment mentioning _PSD - on some systems it's enough to
put one logical CPU into a particular P-state to make other CPUs in
the same domain to enter that P-state.

Also, call sched_unbind() after the loop - sched_bind() automatically
rebinds from previous CPU to a new one, and the new arrangement of code
is safer against early loop exit.

Plus one minor style nit.

MFC after:	10 days
2010-11-16 12:43:45 +00:00
Andriy Gapon
b3fa872420 make it possible to actually enable hwpstate_verbose
Either via the tunable or the sysctl.

MFC after:	3 days
2010-11-11 17:30:49 +00:00
Alexander Motin
c59528330a Few whitespace cleanups and comments tunings.
Submitted by:	arundel
2010-09-16 02:59:25 +00:00
Alexander Motin
5ec55931d6 Core i5, same as previously Core2Duo, found to not set P-state for single
core lower then set on other cores. Do not try to test P-states on attach
on SMP systems. It is hopeless now and will just pollute verbose logs.
If needed, check still can be forced via loader tunable.
2010-06-19 13:09:42 +00:00
Attilio Rao
3258030144 Introduce the new kernel sub-tree x86 which should contain all the code
shared and generalized between our current amd64, i386 and pc98.

This is just an initial step that should lead to a more complete effort.
For the moment, a very simple porting of cpufreq modules, BIOS calls and
the whole MD specific ISA bus part is added to the sub-tree but ideally
a lot of code might be added and more shared support should grow.

Sponsored by:	Sandvine Incorporated
Reviewed by:	emaste, kib, jhb, imp
Discussed on:	arch
MFC:		3 weeks
2010-02-25 14:13:39 +00:00