With various firmware files used by graphics and wireless drivers
we are exceeding the current 32 character module name (file path
in kldxref) length.
In order to overcome this issue bump it to the maximum path length
for the next version.
To be able to MFC provide backward compat support for another version
of the struct as the offsets for the second half change due to the
array size increase.
MAXMODNAME being defined to MAXPATHLEN needs param.h to be
included first. With only 7 modules (or LinuxKPI module.h) not
doing that adjust them rather than including param.h in module.h [1].
Reported by: Greg V (greg unrelenting.technology)
Sponsored by: The FreeBSD Foundation
Suggested by: imp [1]
MFC after: 10 days
Reviewed by: imp (and others to different level)
Differential Revision: https://reviews.freebsd.org/D32383
Before this device unit number match was coincidental and broke if I
disabled some CPU device(s). Aside of cosmetics, for some drivers
(may be considered broken) it caused talking to wrong CPUs.
Actually check the wrmsr_safe() return value when setting autonomous
HWP for package.
PR: 245582
Differential Revision: https://reviews.freebsd.org/D24744
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT
Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718
If package-level control is present, we default to using it. Per-core
software control may be enabled by setting the machdep.hwpstate_pkg_ctrl
tunable to "0" in loader.conf(5).
Per Intel SDM (Vol 3b Part 2), if HWP indicates EPP (energy-performance
preference) is not supported, the hardware instead uses the ENERGY_PERF_BIAS
MSR. In the epp sysctl handler, fall back to that MSR if HWP does not
support EPP and CPUID indicates the ENERGY_PERF_BIAS MSR is supported.
I don't know why a Skylake CPU with the HWP feature bit present would trap
on MSR reads of the HWP registers, but if this occurs, do not leave the
attach thread bound. This could conceivably cause reported hangs, although
I have no evidence that this is the cause.
Reported by: ae@, Andreas Nilsson <andrnils AT gmail.com>
X-MFC-With: r357002
Add a sysctl knob to allow users to re-enable it, and document the knob and
default in cpufreq.4. (While here, add a few unrelated updates to
cpufreq.4.)
It seems that the register value in some hardware simply reflects the
configured P-state. This results in an inadvertent and unintended outcome
where the P-state can only walk down, and then the driver becomes "stuck" in
the slowest possible P-state.
The Linux driver never consults this register, so that's some evidence that
ignoring the contents are relatively harmless.
PR: 234733
Reported by: sigsys AT gmail.com, Erich Dollanksy <freebsd.ed.lists AT
sumeritec.com>
These were all introduced in the initial import of hwpstate_intel(4).
Reported by: Coverity
CIDs: 1413161, 1413164, 1413165, 1413167
X-MFC-With: r357002
If we're going to throttle user requested P-states, we should at least produce
a debug log line indicating the surprising behavior.
PR: inspired by 234733
DRIVER_MODULE does not actually define a MODULE_VERSION, which is required
to satisfy a MODULE_DEPENDency. Declare one explicitly in
hwpstate_intel(4).
Reported by: flo
X-MFC-With: r357002
Intel Speed Shift is Intel's technology to control frequency in hardware,
with hints from software.
Let's get a working version of this in the tree and we can refine it from
here.
Submitted by: bwidawsk, scottph
Reviewed by: bcr (manpages), myself
Discussed with: jhb, kib (earlier versions)
With feedback from: Greg V, gallatin, freebsdnewbie AT freenet.de
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D18028
As a new x86 CPU vendor, Chengdu Haiguang IC Design Co., Ltd (Hygon)
is a joint venture between AMD and Haiguang Information Technology Co.,
Ltd., aims at providing x86 processors for China server market.
The first generation Hygon processor(Dhyana) shares most architecture
with AMD's family 17h, but with different CPU vendor ID("HygonGenuine")
and PCI vendor ID(0x1d94) and family series number 18h(Hygon negotiated
with AMD to confirm that only Hygon use family 18h).
To enable Hygon Dhyana support in FreeBSD, add new definitions
HYGON_VENDOR_ID("HygonGenuine") and X86_VENDOR_HYGON(0x1d94) to identify
Hygon Dhyana CPU.
Initialize the CPU features(topology, local APIC ext, MSI, TSC, hwpstate,
MCA, DEBUG_CTL, etc) for amd64 and i386 mode by sharing the code path of
AMD family 17h.
The changes have been applied on FreeBSD 13.0-CURRENT and tested
successfully on Hygon Dhyana processor.
References:
[1] Linux kernel patches for Hygon Dhyana, merged in 4.20:
https://git.kernel.org/tip/c9661c1e80b609cd038db7c908e061f0535804ef
[2] MSR and CPUID definition:
https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
Submitted by: Pu Wen <puwen@hygon.cn>
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23163
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.
EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).
As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions. The remainder of the patch addresses
adding appropriate includes to fix those files.
LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).
No functional change (intended). Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed. __FreeBSD_version has been bumped.
Uses of mallocarray(9).
The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.
Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.
Reported by: wosch
PR: 225197
Focus on code where we are doing multiplications within malloc(9). None of
these ire likely to overflow, however the change is still useful as some
static checkers can benefit from the allocation attributes we use for
mallocarray.
This initial sweep only covers malloc(9) calls with M_NOWAIT. No good
reason but I started doing the changes before r327796 and at that time it
was convenient to make sure the sorrounding code could handle NULL values.
X-Differential revision: https://reviews.freebsd.org/D13837
turn it off by default. It is very inefficient to verify current P-state of
each core, especially for CPUs with many cores. When multiple commands are
requested to the same power domain before completion of pending transitions,
the last command is executed according to the manual. Because requests are
serialized by the caller, all cores will receive the same command for each
call. Do not call sched_bind() and sched_unbind(). It is redundant because
the caller does it anyway.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
It doesn't seem necessary to busy the CPU while waiting to transition
into a different p-state.
PR: 221621 (related, but does not completely address)
Reviewed by: truckman
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12260
This information is normally available via acpi_perf, but in case it is not,
add support for fetching the information via MSRs on AMD family 17h (Zen)
processors. Zen uses a slightly different formula than previous generation
AMD CPUs.
This was inspired by, but does not fix, PR 221621.
Reported by: Sean P. R. <seanpr AT swbell.net>
Reviewed by: mjoras@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12082
actual numbers would help debugging (also, `MSR' and `ACPI' are standard
abbreviations and thus should be properly capitalized)
- Rephrase unsupported AMD CPUs message and wrap as an overly long line:
`sorry' 1) is wrongly spelled after period (starts with a small letter)
and 2) carries emotional "tinge" that is unnecessary and even bogus in
debug message; `implemented' is not the best word as `supported' suits
better in this context
- Improve readability when reporting resulted P-state transition (debug)
Approved by: jhb
Use the same logic to calculate the nominal CPU frequency from the P-state
MSRs on family 0x12, 0x15, and 0x16 CPUs as is used for family 0x10.
Family 0x14 was included in the original patch in the PR but I left that
out as the BIOS writer's guide for family 0x14 CPUs show a different layout
for the relevant MSR and include a different formulate for calculating the
frequency.
While here, simplify a few expressions and print out the family of
unsupported CPUs in hex rather than decimal.
PR: 212020
Submitted by: Anthony Jenkins <Scoobi_doo@yahoo.com>
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D7587
Uses of commas instead of a semicolons can easily go undetected. The comma
can serve as a statement separator but this shouldn't be abused when
statements are meant to be standalone.
Detected with devel/coccinelle following a hint from DragonFlyBSD.
MFC after: 1 month
These changes prevent sysctl(8) from returning proper output,
such as:
1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.
Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks
Sponsored by: Mellanox Technologies
Intel manual says: "If a transition is already in progress, transition to
a new value will subsequently take effect. Reads of IA32_PERF_CTL determine
the last targeted operating point." So seems it should be fine to just
trigger wanted transition and go. Linux does the same.
MFC after: 1 month
safer for i386 because it can be easily over 4 GHz now. More worse, it can
be easily changed by user with 'machdep.tsc_freq' tunable (directly) or
cpufreq(4) (indirectly). Note it is intentionally not used in performance
critical paths to avoid performance regression (but we should, in theory).
Alternatively, we may add "virtual TSC" with lower frequency if maximum
frequency overflows 32 bits (and ignore possible incoherency as we do now).
Also, add a comment mentioning _PSD - on some systems it's enough to
put one logical CPU into a particular P-state to make other CPUs in
the same domain to enter that P-state.
Also, call sched_unbind() after the loop - sched_bind() automatically
rebinds from previous CPU to a new one, and the new arrangement of code
is safer against early loop exit.
Plus one minor style nit.
MFC after: 10 days