36 Commits

Author SHA1 Message Date
Andriy Gapon
a18f280538 fix a watchdogd regression introduced in r308040
The code assumed that 'timeout' and 'timeout_sec' are in sync
which they weren't if no '-t' option was passed to watchdogd.

Reported by:	Olivier Smedts <olivier@gid0.org>,
		Alex Deiter <alex.deiter@gmail.com>
Tested by:	Olivier Smedts <olivier@gid0.org>,
		Alex Deiter <alex.deiter@gmail.com>
MFC after:	5 days
X-MFC with:	r308040
2016-11-10 10:45:12 +00:00
Andriy Gapon
9cb44c5d21 nap time between pats is forced to be at most half of the timeout
Previously, if the timeout was less than 10 seconds (about 8 seconds, for
example), the watchdog timer would be allowed to expire before the watchdog
was patted.

MFC after:	2 weeks
2016-10-28 14:49:54 +00:00
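
The clamping rule described in this commit can be sketched in C as follows. This is a minimal illustration, not the code from the commit: the helper name clamp_nap and the one-second floor are assumptions made for the example.

```c
#include <time.h>

/*
 * Hypothetical helper: cap the nap between pats at half of the timeout,
 * so that at least one pat lands before the watchdog can expire.
 */
static time_t
clamp_nap(time_t nap, time_t timeout_sec)
{
	time_t limit = timeout_sec / 2;

	if (limit < 1)
		limit = 1;	/* assumption: never nap for zero seconds */
	return (nap > limit ? limit : nap);
}
```

With a 10-second nap and an 8-second timeout, the nap is reduced to 4 seconds; with the 128-second default timeout, the 10-second nap is left alone.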
Ian Lepore
7b4a83b1d0 Add a new exit-timeout option to watchdogd.
Watchdogd currently disables the watchdog when it exits, such as during
rc.shutdown processing.  That leaves the system vulnerable to getting hung
or deadlocked during the shutdown part of a reboot.  For embedded systems
it's especially important that the hardware watchdog always be active.  It
can also be useful for servers that are administered remotely.

The new -x <seconds> option tells watchdogd to program the watchdog with the
given timeout just before exiting.  The -x value can be longer or shorter
than the -t normal time value, to allow for various exceptional conditions
at shutdown such as allowing extra time for buffer flushing.

The exit value is also used internally in the "failsafe" handling (which
used to just disable the watchdog), on the theory that if you're using this
option, "safe" means having the watchdog always running, not disabled.

The default is still to disable the watchdog on exit if -x is not specified.

Differential Revision:	https://reviews.freebsd.org/D2556 (timed out)
2015-08-19 21:46:12 +00:00
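
The exit behaviour described above amounts to a small decision: disable the watchdog unless an exit timeout was given. A sketch of that decision, assuming a hypothetical exit_action structure and choose_exit_action helper that do not appear in the commit:

```c
#include <stdbool.h>

/* Hypothetical description of what to program into the watchdog on exit. */
struct exit_action {
	bool	disable;	/* true: turn the watchdog off */
	int	timeout_sec;	/* meaningful only when disable is false */
};

/*
 * Without -x the watchdog is disabled on exit (the default); with -x it
 * is re-armed with the given number of seconds instead.
 */
static struct exit_action
choose_exit_action(bool have_exit_timeout, int exit_timeout_sec)
{
	struct exit_action act = { .disable = true, .timeout_sec = 0 };

	if (have_exit_timeout) {
		act.disable = false;
		act.timeout_sec = exit_timeout_sec;
	}
	return (act);
}
```

The same result would presumably apply on the "failsafe" path, since the message says the exit value is reused there.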
Xin LI
dad6df6124 Default to use 10 seconds as nap interval instead of 1.
Previously, the nap interval was 1 second while the default timeout was
128 seconds, which could be overkill, and for some hardware the patting
action may be expensive.

Note that the choice of nap interval is still arbitrary.  We preferred
a safe value where even when the system is very heavily loaded, the
watchdog should not shoot the system down if it's not really hung.
According to the manual page of Linux's watchdog daemon, the nap interval
time of theirs is 10 seconds, which seems to be a reasonable value --
according to Intel documentation AP-725 (Document Number: 292273-001),
ICH5's maximum timeout is about 37.5 seconds, which the ichwd(4) driver
would set when we requested 128 seconds (although it should probably
report this as an error and not set the timeout).  Since that's
the shortest maximum value, 10 seconds seems to be the right choice for
us too.

Discussed with:	alfred
MFC after:	1 month
2014-11-16 09:44:30 +00:00
Alfred Perlstein
907745a810 Fix bug in r253719: fix command line watchdog disable.
r253719 disallowed watchdog(8) from disabling the watchdog
by breaking the ability to pass 0 as a timeout arg.  Fix this.
2013-08-10 01:48:15 +00:00
John Baldwin
22dbec3de7 Apply a casting sledgehammer.
Submitted by:	dhw
2013-07-30 16:20:54 +00:00
Ian Lepore
232b79f5f7 Fix printf of seconds for systems where time_t is 64 bits.
2013-07-28 16:56:31 +00:00
Alfred Perlstein
3d30404f83 Fix watchdog pretimeout.
The original API calls for pow2ns, however the new APIs from
Linux call for seconds.

We need to be able to convert to/from 2^Nns to seconds in both
userland and kernel to fix this and properly compare units.
2013-07-27 20:47:01 +00:00
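
The unit conversion this commit refers to can be illustrated in portable C. The sketch below is not the code from the commit: the function names are hypothetical, the rounding directions are a choice made for the example, and the input is assumed small enough that multiplying by one billion does not overflow 64 bits.

```c
#include <stdint.h>

#define NSEC_PER_SEC	1000000000ULL

/*
 * Hypothetical helper: smallest exponent N such that 2^N nanoseconds is
 * at least 'sec' seconds.  A value of zero seconds is left to the caller.
 */
static int
seconds_to_pow2ns(uint64_t sec)
{
	uint64_t ns = sec * NSEC_PER_SEC;
	int n = 0;

	while (n < 63 && (1ULL << n) < ns)
		n++;
	return (n);
}

/* Hypothetical helper: whole seconds contained in 2^N nanoseconds. */
static uint64_t
pow2ns_to_seconds(int n)
{
	if (n < 0 || n > 63)
		return (0);
	return ((1ULL << n) / NSEC_PER_SEC);
}
```

Because the two representations do not line up exactly, a round trip is lossy: 128 seconds rounds up to 2^37 ns, which converts back to 137 whole seconds. Comparing timeouts therefore only makes sense once both sides are in the same unit.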
Ed Schouten
5e49d30e89 Mark the act_tbl static/const.
This table is only used within this source file and is only accessed
read-only.

MFC after:	1 week
2013-04-08 08:05:15 +00:00
Mark Johnston
8d7ad01f94 Invert the meaning of -S (added in r247405) and document its meaning. Also,
don't carp about the watchdog command taking too long until after the
watchdog has been patted, and don't carp via warnx(3) unless -S is set
since syslog(3) already logs to standard error otherwise.

Discussed with:	alfred
Reviewed by:	alfred
Approved by:	emaste (co-mentor)
2013-03-26 19:43:18 +00:00
Alfred Perlstein
4b9b732ac0 watchdogd(8) and watchdog(4) enhancements.
The following support was added to watchdog(4):
- Support to query the outstanding timeout.
- Support to set a software pre-timeout function watchdog with an 'action'
- Support to set a software only watchdog with a configurable 'action'

'action' can be a mask specifying a single operation or a combination of:
 log(9), printf(9), panic(9) and/or kdb_enter(9).

Support the following in watchdogd:
- Support to utilize the new additions to watchdog(4).
- Support to warn if a watchdog script runs for too long.
- Support for "dry run" where we do not actually arm the watchdog,
  but only report on our timing.

Sponsored by:   iXsystems, Inc.
MFC after:      1 month
2013-02-27 19:03:31 +00:00
Ian Lepore
3c5bfb5885 Revert accidental regression to previous misspelling.
Approved by:	cognet (mentor)
2013-01-26 22:02:40 +00:00
Ian Lepore
e6af9f3a37 Reduce watchdogd's memory footprint when running daemonized.
This uses the recently-added jemalloc(3) feature of setting the lg_chunk
tuning option to zero to request that memory be allocated in the smallest
chunks possible.  Without this option, the default is to initially map 8MB,
and then the mlockall() call wires that entire allocation even though the
program only uses a few Kbytes of it at runtime.

PR:		bin/173332
Approved by:	cognet (mentor)
2013-01-26 21:29:45 +00:00
Alfred Perlstein
b647367061 Spelling: exitting -> exiting
MFC after:	2 weeks
2013-01-18 02:36:06 +00:00
Xin LI
652c42600b Replace log(3) with flsll(3) for watchdogd(8) and drop libm dependency.
MFC after:	2 weeks
2012-11-03 18:38:28 +00:00
Andrey Zonov
193e2b5546 - We also need to lock current memory.
Approved by:	kib (mentor)
MFC after:	1 week
2012-08-30 08:07:37 +00:00
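
Locking both current and future mappings is typically done with a single mlockall(2) call. A minimal sketch, assuming a hypothetical wrapper named lock_process_memory; the flag combination is inferred from the commit message, not copied from the commit:

```c
#include <sys/mman.h>
#include <errno.h>

/*
 * Hypothetical wrapper: wire all pages that are mapped now (MCL_CURRENT)
 * and all pages mapped later (MCL_FUTURE) so the daemon cannot be paged
 * out.  Returns 0 on success or the errno value on failure.
 */
static int
lock_process_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
		return (errno);
	return (0);
}
```

The call can fail when the process lacks the privilege or resource limit to wire memory, so a caller would normally report the error instead of ignoring it.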
Andrey Zonov
e489ac6c53 - Don't allow watchdogd(8) to be swapped out.
On machines with a large amount of swap and high I/O activity,
  watchdogd(8) may wait for swapped-out memory longer than the timeout,
  and the watchdog sometimes fires.

Approved by:	kib (mentor)
MFC after:	1 week
2012-08-28 08:38:53 +00:00
Ed Maste
f145c771fb Protect the watchdog daemon against swap OOM killer. This is similar to
SVN r199804 which added protection to sshd, cron, syslogd, and inetd.
2010-09-26 01:45:33 +00:00
Xin LI
d10d35b369 Staticify local variables.
While I'm here, also add a 'static' keyword to a function to make it
consistent with its prototype.

Reviewed by:	phk
MFC after:	3 months
2010-07-20 17:42:13 +00:00
Ed Schouten
10bc3a7f42 ANSIfy almost all applications that use WARNS=6.
I was considering committing all these patches one by one, but as
discussed with brooks@, there is no need to do this. If we ever
need/want to merge these changes back, it is still possible to do this
per application.
2009-12-29 22:53:27 +00:00
Ruslan Ermilov
c409ce41b4 Don't hide an error if the initial attempt to program a watchdog from
within watchdogd(8) fails.  This is also consistent with watchdog(8).
2009-12-21 15:50:37 +00:00
Nick Hibma
0f71a1cba6 Don't exit from watchdogd on receiving a signal if we cannot stop the watchdog.
That'll require -KILL. This avoids resetting your system on one of the
watchdogs that you cannot disable.
2006-12-15 22:47:36 +00:00
Poul-Henning Kamp
5c27253056 Fix usage().
Submitted by:	Adrian Steinmann <ast@marabu.ch>
2006-03-06 07:42:52 +00:00
Poul-Henning Kamp
2a5f59b241 Report any errors we might see when disabling the watchdog.
Complain about extra arguments so people don't get surprised
if they type "watchdog 0"
2005-09-30 08:30:20 +00:00
Pawel Jakub Dawidek
8b28aef238 Pidfiles should be created with permissions that prevent users from
opening them for reading. When a user can open a file for reading, they can
also flock(2) it, which can lead to confusion.

Pointed out by:	green
2005-09-16 11:24:28 +00:00
Pawel Jakub Dawidek
0ea90af02e Use pidfile(3) in watchdogd(8).
2005-08-24 19:28:33 +00:00
Marius Strobl
bdd466ffca When disarming a watchdog by using an interval of WD_TO_NEVER, a
non-zero return value from the ioctl doesn't indicate that the command has
failed, so don't let watchdog(8) return an error in this case.

MFC after:	3 days
2005-03-19 01:46:37 +00:00
Brian Feldman
a693939e67 Disable memory locking that could keep watchdogd from deadlocking itself
if the swap subsystem failed.

Requested by:	phk
2004-07-28 22:13:04 +00:00
Brian Feldman
5e60838b5d Now that mlockall(2) is unbroken, use it to keep watchdogd(8) permanently
out of swap.
2004-07-23 15:24:57 +00:00
Sean Kelly
670c8a8ccd Bump the copyright year since I forgot last time.
2004-05-03 21:41:02 +00:00
Sean Kelly
459336c4d8 Update comments to reflect changes made by phk. Also no longer need
<sys/sysctl.h>.
2004-04-28 07:35:03 +00:00
Poul-Henning Kamp
4103b7652d Rename the WATCHDOG option to SW_WATCHDOG and make it use the
generic watchdog(9) interface.

Make watchdogd(8) perform as watchdog(8) as well, and make it
possible to specify a check command to run, timeout and sleep
periods.

Update watchdog(4) to talk about the generic interface and add
new watchdog(8) page.
2004-02-28 20:56:35 +00:00
Sean Kelly
bc8b08782b o style(9) fixes
  - Reordered #includes
  - Only include <sys/types.h>, not it and <sys/cdefs.h>
o style.Makefile(5) fixes
  - No SRCS= line when only one src file with same name as program
o Use warn()/errx() instead of fprintf()
  - Integrated patch from Philippe Charnier <charnier@xp11.frmug.org>

Approved by:	jeff (mentor)
2003-07-03 03:37:04 +00:00
Sean Kelly
055d177fb0 Unbreak this for alpha and friends.
Double pointy hat to me, or something.
2003-06-26 18:36:57 +00:00
Maxim Konovalov
e931de40e8 o Fix typo.
Submitted by:	smkelly
2003-06-26 11:24:10 +00:00
Sean Kelly
370c3cb57c - Add a software watchdog facility.
This commit has two pieces. One half is the watchdog kernel code which lives
primarily in hardclock() in sys/kern/kern_clock.c. The other half is a userland
daemon which, when run, will keep the watchdog from firing while the userland
is intact and functioning.

Approved by:	jeff (mentor)
2003-06-26 09:50:52 +00:00