freebsd-skq/sys
Hans Petter Selasky 10c8755706 Fix for race leading to endless timer interrupts related to
configtimer().

During normal operation "state->nextcallopt" will always be less than
or equal to "state->nextcall" and checking only "state->nextcallopt"
before calling "callout_process()" is sufficient. However, when
"configtimer()" is called, a race might happen that requires both of
these binary times to be checked.

Short description of race:

1) A configtimer() call will reset both "state->nextcall" and
"state->nextcallopt" to the same binary time.

2) If a "callout_reset()" call happens between "configtimer()" and the
next "callout_process()" call, "state->nextcallopt" will get updated
and "state->nextcall" will remain at the current time. Refer to logic
inside cpu_new_callout().

3) getnextcpuevent() only respects "state->nextcall" and returns this
value over and over again, even if it is in the past, until "now >=
state->nextcallopt" becomes true. Then these two time variables are
corrected by a "callout_process()" call and the situation goes back to
normal.
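
The three steps above can be sketched as a small user-space simulation.
This is only an illustrative model under stated assumptions, not the
kernel code: "simtime_t", "struct sim_state", "count_interrupts()" and
the "min_period" clamp are hypothetical names introduced here, time is
a plain integer tick count, and "callout_process()" is reduced to
resynchronizing both fields to the single pending deadline.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's binary time type. */
typedef int64_t simtime_t;

/* Only the two fields discussed in the commit message are modelled. */
struct sim_state {
	simtime_t nextcall;
	simtime_t nextcallopt;
};

/*
 * Count the timer interrupts taken until the callout scheduled at
 * "deadline" has been processed.  With check_both == false only
 * nextcallopt is tested (behaviour described as "before the patch");
 * with check_both == true both fields are tested.
 */
static long
count_interrupts(bool check_both, simtime_t deadline, simtime_t min_period)
{
	struct sim_state st;
	simtime_t now = 0;
	long interrupts = 0;

	/* Step 1: configtimer() resets both fields to the same time. */
	st.nextcall = st.nextcallopt = now;

	/*
	 * Step 2: a callout_reset() arrives.  nextcallopt is updated,
	 * nextcall only moves if the new deadline is earlier.
	 */
	st.nextcallopt = deadline;
	if (deadline < st.nextcall)
		st.nextcall = deadline;

	for (;;) {
		/*
		 * Step 3: the next event is derived from nextcall only.
		 * An event in the past is assumed to fire after the
		 * shortest programmable period.
		 */
		simtime_t next = st.nextcall;
		if (next < now + min_period)
			next = now + min_period;
		now = next;
		interrupts++;

		bool due = now >= st.nextcallopt ||
		    (check_both && now >= st.nextcall);
		if (due) {
			/* Simplified callout_process(): resync both fields. */
			if (now >= deadline)
				return (interrupts);
			st.nextcall = st.nextcallopt = deadline;
		}
	}
}
```

In this model, count_interrupts(false, 1000, 1) takes one interrupt per
tick until the deadline is reached, while count_interrupts(true, 1000, 1)
takes one interrupt to resynchronize and one more at the deadline.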

The problem manifests itself in different ways. The common factor is
that the timer process(es) consume all CPU time on one or more CPU
cores for a long time, blocking other kernel processes from getting
execution time. This shows up as very high interrupt counts in the
output of "vmstat -i | grep timer" right after boot.

When EARLY_AP_STARTUP was enabled in r310177 the likelihood of hitting
this bug apparently increased.

Example output from "vmstat -i" before patch:
cpu0:timer                          7591         69
cpu9:timer                      39031773     358089
cpu4:timer                          9359         85
cpu3:timer                          9100         83
cpu2:timer                          9620         88

Example output from "vmstat -i" after patch:
cpu0:timer                          4242         34
cpu6:timer                          5531         44
cpu3:timer                          6450         52
cpu1:timer                          4545         36
cpu9:timer                          7153         58

Before the patch, cpu9 in the example above was spinning in a loop and
reached 39 million interrupts just a few seconds after bootup. After
the patch the timer interrupt counts are more or less consistent.
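
A check for this symptom can be sketched as a small parser for the
"cpuN:timer total rate" lines shown above. This is a hedged
illustration only: "struct timer_row", "parse_timer_row()" and
"find_spinning_cpu()" are hypothetical helpers, and the "factor"
threshold is an arbitrary assumption rather than a value taken from
the commit.

```c
#include <stddef.h>
#include <stdio.h>

/* One "cpuN:timer total rate" line from "vmstat -i | grep timer". */
struct timer_row {
	int cpu;
	unsigned long total;
	unsigned long rate;
};

/* Returns 1 if the line has the expected shape, 0 otherwise. */
static int
parse_timer_row(const char *line, struct timer_row *row)
{
	return (sscanf(line, "cpu%d:timer %lu %lu",
	    &row->cpu, &row->total, &row->rate) == 3);
}

/*
 * Returns the CPU number whose interrupt total is more than "factor"
 * times the smallest total seen, or -1 if no CPU stands out.
 */
static int
find_spinning_cpu(const char *const lines[], size_t n, unsigned long factor)
{
	struct timer_row row;
	unsigned long min_total = 0, max_total = 0;
	int max_cpu = -1, seen = 0;

	if (factor == 0)
		return (-1);
	for (size_t i = 0; i < n; i++) {
		if (!parse_timer_row(lines[i], &row))
			continue;	/* skip lines that do not match */
		if (!seen || row.total < min_total)
			min_total = row.total;
		if (!seen || row.total > max_total) {
			max_total = row.total;
			max_cpu = row.cpu;
		}
		seen = 1;
	}
	/* Dividing instead of multiplying avoids integer overflow. */
	if (!seen || max_total / factor <= min_total)
		return (-1);
	return (max_cpu);
}
```

With a factor of 100, the "before patch" sample singles out cpu9, and
the "after patch" sample reports no outlier.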

Discussed with:		mav@
Reported by:		several people
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 17:40:31 +00:00
..
amd64 vmm_dev: work around a bogus error with gcc 6.3.0 2017-01-20 13:21:27 +00:00
arm Handle the set capabilities ioctl, letting the hardware checksum be 2017-01-19 14:58:55 +00:00
arm64 Catch up with changes to structure member names. 2017-01-17 22:05:52 +00:00
boot Remove empty ranges property so beri_simplebus can be attached again. 2017-01-18 14:41:59 +00:00
bsm
cam Remove writing 'residual' field of struct ctl_scsiio. 2017-01-17 18:32:47 +00:00
cddl MFV 312436 2017-01-20 15:01:04 +00:00
compat Catch up with changes to structure member names. 2017-01-17 22:05:52 +00:00
conf mppc - Finish plugging NETGRAPH_MPPC_COMPRESSION. 2017-01-20 00:02:11 +00:00
contrib Merge ACPICA 20170119. 2017-01-19 22:07:21 +00:00
crypto libmd: add noexec stack annotation in skein_block_asm.s 2017-01-07 19:26:25 +00:00
ddb Revert r311952. 2017-01-14 22:06:25 +00:00
dev Fix reference to free memory in ixgbe/if_media.c 2017-01-20 17:16:48 +00:00
fs Remove mistakenly merged field. 2017-01-19 20:03:26 +00:00
gdb
geom Report disk addition errors on add or create subcommand. 2017-01-20 13:49:04 +00:00
gnu Add Ingenic X1000 DTS files (unofficial). 2016-11-19 15:03:49 +00:00
i386 Catch up with changes to structure member names. 2017-01-17 22:05:52 +00:00
isa
kern Fix for race leading to endless timer interrupts related to 2017-01-20 17:40:31 +00:00
kgssapi
libkern libkern: Remove obsolete 'register' keyword 2017-01-12 17:02:29 +00:00
mips [ar71xx] add EARLY_PRINTF support for the rest of the non-AR933x SoCs. 2017-01-15 06:35:00 +00:00
modules Use SRCTOP-relative paths to other directories instead of .CURDIR-relative ones 2017-01-20 05:45:07 +00:00
net Fix reference to free memory in ixgbe/if_media.c 2017-01-20 17:16:48 +00:00
net80211 [net80211] allow for MCS16-23 to be statically configured. 2017-01-20 07:43:40 +00:00
netgraph mppc - Finish plugging NETGRAPH_MPPC_COMPRESSION. 2017-01-20 00:02:11 +00:00
netinet Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
netinet6 Implement kernel support for hardware rate limited sockets. 2017-01-18 13:31:17 +00:00
netipsec Add direction argument to ipsec_setspidx_inpcb() function. 2017-01-08 12:40:07 +00:00
netnatm
netpfil Initialize IPFW static rules rmlock with RM_RECURSE flag. 2017-01-17 10:50:28 +00:00
netsmb
nfs
nfsclient
nfsserver
nlm
ofed
opencrypto Add support for the fpu_kern(9) KPI on arm64. It hooks into the existing 2016-10-20 09:22:10 +00:00
pc98 Add a COMPAT_FREEBSD11 kernel option. 2016-12-09 18:54:12 +00:00
powerpc Use the explicit expanded form of cmp. 2017-01-18 03:42:21 +00:00
riscv Disable superpages reservations as we don't have implemented them yet. 2016-11-21 12:00:31 +00:00
rpc
security Audit 'fd' and 'cmd' arguments to fcntl(2), and when generating BSM, 2016-11-22 00:41:24 +00:00
sparc64 Trim a few comments on platforms that did not implement mmap of /dev/kmem. 2017-01-13 21:52:53 +00:00
sys Addition of clang nullability qualifiers. 2017-01-20 15:56:40 +00:00
teken
tests
tools Replace using of objdump with elfdump 2017-01-10 18:46:40 +00:00
ufs ffs_vnops: Simplify extattr access 2017-01-19 16:46:05 +00:00
vm Avoid unnecessary page lookups in vm_object_madvise(). 2017-01-15 03:50:08 +00:00
x86 "Buses" is the preferred plural of "bus" 2017-01-15 17:54:01 +00:00
xdr
xen "Buses" is the preferred plural of "bus" 2017-01-15 17:54:01 +00:00
Makefile