====== Watchdog on Linux/Ubuntu ======
===== Background =====
Watchdog timers are commonly found in embedded systems and other computer-controlled equipment where humans cannot easily access the equipment or would be unable to react to faults in a timely manner. In such systems, the computer cannot depend on a human to reboot it if it hangs; it must be self-reliant.
The Odroid C2 supports the watchdog driver **gxbb_wdt**, which controls the PMU.
===== Test Watchdog module =====
The watchdog driver gxbb_wdt can be enabled for the Odroid C2.
Once the driver is loaded, the device files /dev/watchdog and /dev/watchdog0 should be present:
odroid@odroid64:~$ ls -la /dev/watchdog*
crw------- 1 root root 10, 130 Feb 11 11:28 /dev/watchdog
crw------- 1 root root 248, 0 Feb 11 11:28 /dev/watchdog0
odroid@odroid64:~$
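The same check can be scripted, for example in a boot-time health check. The following is a minimal sketch; the function name find_watchdog_nodes is hypothetical, and it only tests that the nodes exist, not that they are character devices.

```shell
# List watchdog device nodes under a directory (default: /dev).
# Prints one path per line and returns 1 when none are found.
# find_watchdog_nodes is a hypothetical helper name used for illustration.
find_watchdog_nodes() {
    dir="${1:-/dev}"
    found=1
    for node in "$dir"/watchdog*; do
        # The glob stays literal when nothing matches, so test for existence.
        [ -e "$node" ] || continue
        printf '%s\n' "$node"
        found=0
    done
    return "$found"
}
```

Calling find_watchdog_nodes without arguments on the board should print the two paths shown in the listing above if the driver is loaded.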
Writing to the device file manually starts the watchdog timer. Because nothing keeps resetting the timer afterwards, the board reboots once the timeout expires.
root@odroid64:~# echo 3 > /dev/watchdog
[ 186.570231] watchdog watchdog0: watchdog did not stop!
root@odroid64:~#
To stop the watchdog manually before it reboots the board, write the character V to the device:
# echo V > /dev/watchdog
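The two manual steps above can be combined into one session: keep the device open, write to it periodically to reset the timer, and finish with V before closing. This is only a sketch under assumptions: the function name pet_then_disarm and its parameters are hypothetical, and the final V only stops the timer if the driver honours it (for example, not when the kernel was built or loaded with the "nowayout" option).

```shell
# Reset the watchdog a few times, then ask it to stop.
# pet_then_disarm is a hypothetical helper; arguments:
#   $1 device (default /dev/watchdog), $2 number of writes, $3 delay in seconds
pet_then_disarm() {
    dev="${1:-/dev/watchdog}"
    count="${2:-5}"
    delay="${3:-1}"
    # The brace group opens the device once and closes it when the group ends.
    {
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '.'          # any write resets the timer
            sleep "$delay"
            i=$((i + 1))
        done
        printf 'V'              # final write before close requests a clean stop
    } > "$dev"
}
```

Run it as root, and keep the delay well below the hardware timeout; otherwise the board reboots in the middle of the loop.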
===== Install Watchdog daemon =====
Install the watchdog daemon:
sudo apt-get install watchdog
Create a directory for the watchdog log files:
sudo mkdir -p /var/log/watchdog
Update the default watchdog configuration:
**/etc/default/watchdog**
# Start watchdog at boot time? 0 or 1
run_watchdog=1
# Start wd_keepalive after stopping watchdog? 0 or 1
run_wd_keepalive=1
# Load module before starting watchdog
watchdog_module=gxbb_wdt
# Specify additional watchdog options here (see manpage).
watchdog_options="-s -v -c /etc/watchdog.conf"
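Because this file consists of plain shell variable assignments, a script can source it to confirm what the service will do at boot. A minimal sketch, assuming the variable names shown above; the function name watchdog_boot_summary is hypothetical.

```shell
# Print the boot-related settings from the defaults file and
# return 0 only when run_watchdog is set to 1.
# watchdog_boot_summary is a hypothetical helper name.
watchdog_boot_summary() {
    defaults="${1:-/etc/default/watchdog}"
    [ -r "$defaults" ] || { echo "cannot read $defaults" >&2; return 2; }
    (
        # Source inside a subshell so the variables do not leak into the caller.
        . "$defaults"
        echo "run_watchdog=${run_watchdog:-unset}"
        echo "watchdog_module=${watchdog_module:-unset}"
        [ "${run_watchdog:-0}" = "1" ]
    )
}
```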
===== Watchdog daemon configuration file =====
Edit **/etc/watchdog.conf** and un-comment the **watchdog-device = /dev/watchdog** line so that the daemon actually uses the hardware watchdog through the kernel module. Otherwise the daemon does not use the hardware and relies only on its own code to soft-reboot a broken machine.
$ cat /etc/watchdog.conf
#ping = 172.31.14.1
#ping = 172.26.1.255
#interface = eth0
#file = /var/log/messages
#change = 1407
# Uncomment to enable test. Setting one of these values to '0' disables it.
# These values will hopefully never reboot your machine during normal use
# (if your machine is really hung, the loadavg will go much higher than 25)
#max-load-1 = 24
#max-load-5 = 18
#max-load-15 = 12
# Note that this is the number of pages!
# To get the real size, check how large the pagesize is on your machine.
#min-memory = 1
#repair-binary = /usr/sbin/repair
#repair-timeout =
#test-binary =
#test-timeout =
watchdog-device = /dev/watchdog
# Defaults compiled into the binary
#temperature-device =
#max-temperature = 120
# Defaults compiled into the binary
admin = root
interval = 1
logtick = 1
log-dir = /var/log/watchdog
# This greatly decreases the chance that watchdog won't be scheduled before
# your machine is really loaded
realtime = yes
priority = 1
# Check if rsyslogd is still running by enabling the following line
pidfile = /var/run/rsyslogd.pid
watchdog-timeout = 15
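In the listing above the daemon writes to the device every second (interval = 1) while the hardware timeout is 15 seconds. The daemon can only keep the board alive if the interval stays below the timeout, so a quick sanity check of the two values may be useful after editing the file. The sketch below assumes the "key = value" layout shown above and whole-number values; the function names conf_value and check_feed_margin are hypothetical.

```shell
# Print the value of an un-commented "key = value" line (last match wins).
# conf_value and check_feed_margin are hypothetical helper names.
conf_value() {
    key="$1"
    conf="${2:-/etc/watchdog.conf}"
    awk -F= -v k="$key" '
        { name = $1; gsub(/[ \t]/, "", name) }
        name == k { val = $2; gsub(/^[ \t]+|[ \t]+$/, "", val); found = 1 }
        END { if (found) print val }
    ' "$conf"
}

# Return 0 when interval is smaller than watchdog-timeout.
check_feed_margin() {
    conf="${1:-/etc/watchdog.conf}"
    interval=$(conf_value interval "$conf")
    timeout=$(conf_value watchdog-timeout "$conf")
    if [ -z "$interval" ] || [ -z "$timeout" ]; then
        echo "interval or watchdog-timeout is not set in $conf"
        return 2
    fi
    if [ "$interval" -lt "$timeout" ]; then
        echo "ok: interval ${interval}s is below timeout ${timeout}s"
    else
        echo "warning: interval ${interval}s is not below timeout ${timeout}s"
        return 1
    fi
}
```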
For more configuration options, see the link below.
[[http://www.sat.dundee.ac.uk/psc/watchdog/watchdog-configure.html]]
===== Start Watchdog Service and Verify =====
To start the service at boot, append the following line to /etc/rc.local:
service watchdog restart
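On many Ubuntu images /etc/rc.local ends with "exit 0", and a line appended after it never runs. The following sketch inserts the line before the first "exit 0" and does nothing if the line is already present. It is an illustration under that assumption about the file layout; the function name add_rc_local_line is hypothetical.

```shell
# Add a command to rc.local once, placing it before "exit 0" when present.
# add_rc_local_line is a hypothetical helper; $1 is the line, $2 the file.
add_rc_local_line() {
    line="$1"
    rc="${2:-/etc/rc.local}"
    # Nothing to do when the exact line is already there.
    grep -qxF -- "$line" "$rc" && return 0
    tmp=$(mktemp) || return 1
    awk -v new="$line" '
        !inserted && /^exit 0[ \t]*$/ { print new; inserted = 1 }
        { print }
        END { if (!inserted) print new }
    ' "$rc" > "$tmp" && cat "$tmp" > "$rc"   # cat keeps the file mode intact
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

As root, add_rc_local_line "service watchdog restart" would apply the change described above.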
root@odroid64:~#
odroid@odroid64:~$ service watchdog status
● watchdog.service - watchdog daemon
Loaded: loaded (/lib/systemd/system/watchdog.service; static; vendor preset:
Active: active (running) since Wed 2016-06-22 01:52:23 EDT; 10s ago
Process: 1384 ExecStopPost=/bin/sh -c [ $run_wd_keepalive != 1 ] || false (cod
Process: 1959 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/wa
Process: 1955 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watc
Main PID: 1961 (watchdog)
CGroup: /system.slice/watchdog.service
└─1961 /usr/sbin/watchdog -s -v -c /etc/watchdog.conf
Jun 22 01:52:30 odroid64 watchdog[1961]: still alive after 6 interval(s)
Jun 22 01:52:30 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/
Jun 22 01:52:31 odroid64 watchdog[1961]: still alive after 7 interval(s)
Jun 22 01:52:31 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/
Jun 22 01:52:32 odroid64 watchdog[1961]: still alive after 8 interval(s)
Jun 22 01:52:32 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/
Jun 22 01:52:33 odroid64 watchdog[1961]: still alive after 9 interval(s)
Jun 22 01:52:33 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/
Jun 22 01:52:34 odroid64 watchdog[1961]: still alive after 10 interval(s)
Jun 22 01:52:34 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/
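Once the watchdog daemon is configured and running, it continuously resets the watchdog timer.
Another way to verify that the watchdog device works under the watchdog daemon is to kill the daemon. The timer is then no longer reset, and the board should reboot when the timeout expires.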
root@odroid64:~#
root@odroid64:~# pkill -9 watchdog