coretemp: Only log critical temperature events

According to the Intel manual
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdf
the Thermal Status (0) and Thermal Status Log (1) bits report only a
high temperature on the CPU, not a critical temperature as suggested in
the coretemp driver. Check the Critical Temperature Status bit (4)
instead; the Critical Temperature Log bit (5) is its sticky counterpart.
Once the critical temperature is reached, correct operation is no longer
guaranteed: the CPU may, for example, already have written wrong values
to memory. The OS should therefore be stopped as soon as possible,
because its state is no longer reliable.

Reviewed by: imp (confirmed descriptions of bits, Linux ignores these bits)
Pull Request: https://github.com/freebsd/freebsd-src/pull/562
sadaszewski 2021-11-17 08:27:46 +01:00 committed by Warner Losh
parent 70164d957e
commit 8362905cb6

@@ -53,6 +53,8 @@ __FBSDID("$FreeBSD$");
 #define TZ_ZEROC 2731
+#define THERM_CRITICAL_STATUS_LOG 0x20
+#define THERM_CRITICAL_STATUS 0x10
 #define THERM_STATUS_LOG 0x02
 #define THERM_STATUS 0x01
 #define THERM_STATUS_TEMP_SHIFT 16
@@ -393,7 +395,7 @@ coretemp_get_val_sysctl(SYSCTL_HANDLER_ARGS)
 		 * If we reach a critical level, allow devctl(4)
 		 * to catch this and shutdown the system.
 		 */
-		if (msr & THERM_STATUS) {
+		if (msr & THERM_CRITICAL_STATUS) {
 			tmp = (msr >> THERM_STATUS_TEMP_SHIFT) &
 			    THERM_STATUS_TEMP_MASK;
 			tmp = (sc->sc_tjmax - tmp) * 10 + TZ_ZEROC;
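
For readers who want to try the bit test and the temperature conversion outside the kernel, the following is a minimal standalone sketch rather than the driver's actual code. The helper names therm_is_critical() and therm_decikelvin() are hypothetical, and the value 0x7f for THERM_STATUS_TEMP_MASK is an assumption (a 7-bit readout), since the mask's definition is not part of this diff. The other constants are copied from the hunks above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants as they appear in the diff above. */
#define TZ_ZEROC                  2731  /* 0 degrees C in tenths of a Kelvin */
#define THERM_CRITICAL_STATUS_LOG 0x20
#define THERM_CRITICAL_STATUS     0x10
#define THERM_STATUS_LOG          0x02
#define THERM_STATUS              0x01
#define THERM_STATUS_TEMP_SHIFT   16
/* Assumption: 7-bit readout; the mask is not defined in the diff shown. */
#define THERM_STATUS_TEMP_MASK    0x7f

/*
 * Hypothetical helper: true only when the critical-temperature status
 * bit is set. A plain "high temperature" (THERM_STATUS) does not count.
 */
static bool
therm_is_critical(uint64_t msr)
{
	return ((msr & THERM_CRITICAL_STATUS) != 0);
}

/*
 * Hypothetical helper: convert the readout, which is an offset below
 * tjmax in degrees C, into tenths of a Kelvin, mirroring the arithmetic
 * in the second hunk.
 */
static int
therm_decikelvin(uint64_t msr, int tjmax)
{
	int delta;

	delta = (int)((msr >> THERM_STATUS_TEMP_SHIFT) &
	    THERM_STATUS_TEMP_MASK);
	return ((tjmax - delta) * 10 + TZ_ZEROC);
}
```

With a raw value whose readout field is 30 and only THERM_STATUS set, and tjmax taken as 100, therm_is_critical() returns false and therm_decikelvin() returns (100 - 30) * 10 + 2731 = 3431. Before this change, that same value would have triggered the critical-event path.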