Watchdogd currently disables the watchdog when it exits, such as during
rc.shutdown processing. That leaves the system vulnerable to getting hung
or deadlocked during the shutdown part of a reboot. For embedded systems
it's especially important that the hardware watchdog always be active. It
can also be useful for servers that are administered remotely.
The new -x <seconds> option tells watchdogd to program the watchdog with the
given timeout just before exiting. The -x value can be longer or shorter
than the -t normal time value, to allow for various exceptional conditions
at shutdown such as allowing extra time for buffer flushing.
The exit value is also used internally in the "failsafe" handling (which
used to just disable the watchdog), on the theory that if you're using this
option, "safe" means having the watchdog always running, not disabled.
The default is still to disable the watchdog on exit if -x is not specified.
Differential Revision: https://reviews.freebsd.org/D2556 (timed out)
Off by default, build behaves normally.
WITH_META_MODE we get auto objdir creation, the ability to
start build from anywhere in the tree.
Still need to add real targets under targets/ to build packages.
Differential Revision: D2796
Reviewed by: brooks imp
Previously, we have a nap interval of 1 second while we have a timeout of
128 seconds by default, which could be an overkill, and for some hardware
the patting action may be expensive.
Note that the choice of nap interval is still arbitrary. We preferred
a safe value where even when the system is very heavily loaded, the
watchdog should not shoot the system down if it's not really hung.
According to the manual page of Linux's watchdog daemon, the nap interval
time of theirs is 10 seconds, which seems to be a reasonable value --
according to Intel documentation AP-725 (Document Number: 292273-001),
ICH5's maximum timeout is about 37.5 seconds, which the ichwd(4) driver
would set when we requested 128 seconds (although it should probably
feed back this as an error and do not set the timeout). Since that's
the shortest maximum value, 10 seconds seems to be a right choice for
us too.
Discussed with: alfred
MFC after: 1 month
The original API calls for pow2ns, however the new APIs from
Linux call for seconds.
We need to be able to convert to/from 2^Nns to seconds in both
userland and kernel to fix this and properly compare units.
don't carp about the watchdog command taking too long until after the
watchdog has been patted, and don't carp via warnx(3) unless -S is set
since syslog(3) already logs to standard error otherwise.
Discussed with: alfred
Reviewed by: alfred
Approved by: emaste (co-mentor)
The following support was added to watchdog(4):
- Support to query the outstanding timeout.
- Support to set a software pre-timeout function watchdog with an 'action'
- Support to set a software only watchdog with a configurable 'action'
'action' can be a mask specifying a single operation or a combination of:
log(9), printf(9), panic(9) and/or kdb_enter(9).
Support the following in watchdogged:
- Support to utilize the new additions to watchdog(4).
- Support to warn if a watchdog script runs for too long.
- Support for "dry run" where we do not actually arm the watchdog,
but only report on our timing.
Sponsored by: iXsystems, Inc.
MFC after: 1 month
This uses the recently-added jemalloc(3) feature of setting the lg_chunk
tuning option to zero to request that memory be allocated in the smallest
chunks possible. Without this option, the default is to initally map 8MB,
and then the mlockall() call wires that entire allocation even though the
program only uses a few Kbytes of it at runtime.
PR: bin/173332
Approved by: cognet (mentor)
On machines with huge amount of swap and high IO activity,
watchdogd(8) may wait for a swap memory longer than timeout and
sometimes fires.
Approved by: kib (mentor)
MFC after: 1 week
I was considering committing all these patches one by one, but as
discussed with brooks@, there is no need to do this. If we ever
need/want to merge these changes back, it is still possible to do this
per application.