Differences
This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
en:c2_watchdog_timer [2016/06/07 19:36] odroid created |
en:c2_watchdog_timer [2016/07/14 04:36] moon.linux [Watchdog demon configuration files] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | -- Under construction -- | + | ====== Watchdog on Linux/Ubuntu ====== |
+ | |||
+ | ===== Background ===== | ||
+ | |||
+ | Watchdog timers are commonly found in embedded systems and other computer-controlled equipment where humans cannot easily access the equipment or would be unable to react to faults in a timely manner. In such systems, the computer cannot depend on a human to reboot it if it hangs; it must be self-reliant. | ||
+ | |||
+ | The Odroid C2 supports the watchdog driver **gxbb_wdt**, which controls the PMU. | ||
+ | |||
+ | ===== Test Watchdog module ===== | ||
+ | <WRAP center round important 100%> | ||
+ | The watchdog driver gxbb_wdt is configurable on the Odroid C2. | ||
+ | </WRAP> | ||
+ | |||
+ | Once the driver is loaded, the device files /dev/watchdog and /dev/watchdog0 should exist. | ||
+ | |||
+ | <code> | ||
+ | odroid@odroid64:~$ ls -la /dev/watchdog* | ||
+ | crw------- 1 root root 10, 130 Feb 11 11:28 /dev/watchdog | ||
+ | crw------- 1 root root 248, 0 Feb 11 11:28 /dev/watchdog0 | ||
+ | odroid@odroid64:~$ | ||
+ | </code> | ||
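The same check can be scripted. The following is a minimal sketch, not part of any watchdog package: the helper names `is_char_device` and `report_watchdog_nodes` are hypothetical, and the test only confirms that each path exists and is a character device, which is what both watchdog nodes should be.

```shell
# Hypothetical helper (an assumption, not shipped with the driver):
# succeeds only if the given path exists and is a character device.
is_char_device() {
    [ -c "$1" ]
}

# Report the state of the two device nodes named above.
report_watchdog_nodes() {
    for node in /dev/watchdog /dev/watchdog0; do
        if is_char_device "$node"; then
            echo "$node: present"
        else
            echo "$node: missing"
        fi
    done
}
```

Calling `report_watchdog_nodes` prints one line per node; a "missing" line suggests the module has not been loaded.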
+ | |||
+ | Writing to the device file manually starts the watchdog timer. If nothing keeps writing to the device afterwards, the timer expires and the board reboots. | ||
+ | |||
+ | <code> | ||
+ | root@odroid64:~# echo 3 > /dev/watchdog | ||
+ | [ 186.570231] watchdog watchdog0: watchdog did not stop! | ||
+ | root@odroid64:~# | ||
+ | |||
+ | </code> | ||
+ | |||
+ | To stop the watchdog manually and prevent the reboot, write the character V to the device: | ||
+ | |||
+ | <code> | ||
+ | # echo V > /dev/watchdog | ||
+ | </code> | ||
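The two manual steps above can be combined into one small feeder loop. This is a sketch under assumptions: it relies on the usual Linux watchdog device behaviour (any write resets the countdown, and a `V` written before the device is closed disarms the timer unless the driver refuses to stop), and the function name `feed_watchdog` and its arguments are hypothetical.

```shell
# Hypothetical helper: keep a watchdog device fed for a number of ticks,
# then disarm it with the magic character 'V'.
#   $1 = device path, $2 = number of writes, $3 = seconds between writes
feed_watchdog() {
    dev=$1 ticks=$2 interval=$3
    {
        i=0
        while [ "$i" -lt "$ticks" ]; do
            printf '.' >&3          # any write resets the countdown
            sleep "$interval"
            i=$((i + 1))
        done
        printf 'V' >&3              # written last, just before the device is closed
    } 3> "$dev"                     # the device stays open for the whole loop
}
```

Run as root, `feed_watchdog /dev/watchdog 10 1` would keep the timer fed for roughly ten seconds and then disarm it. The device is held open on one file descriptor so that the timer is not left running between writes.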
+ | |||
+ | ===== Install Watchdog daemon ===== | ||
+ | Install the watchdog daemon: | ||
+ | <code> | ||
+ | sudo apt-get install watchdog | ||
+ | </code> | ||
+ | |||
+ | Create a directory for the watchdog log files: | ||
+ | |||
+ | <code> | ||
+ | sudo mkdir -p /var/log/watchdog | ||
+ | </code> | ||
+ | Append the following settings to the default watchdog configuration file: | ||
+ | **/etc/default/watchdog** | ||
+ | <code> | ||
+ | # Start watchdog at boot time? 0 or 1 | ||
+ | run_watchdog=1 | ||
+ | # Start wd_keepalive after stopping watchdog? 0 or 1 | ||
+ | run_wd_keepalive=1 | ||
+ | # Load module before starting watchdog | ||
+ | watchdog_module=gxbb_wdt | ||
+ | # Specify additional watchdog options here (see manpage). | ||
+ | watchdog_options="-s -v -c /etc/watchdog.conf" | ||
+ | |||
+ | </code> | ||
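A quick sanity check of these settings can be scripted. The sketch below assumes the file contains only simple `name=value` assignments like those shown above, so that a shell can source it; the function name `check_watchdog_defaults` is hypothetical.

```shell
# Hypothetical helper: source a defaults file in a subshell and report
# whether the daemon is enabled and which module would be loaded.
check_watchdog_defaults() {
    file=${1:-/etc/default/watchdog}
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    (
        # Assumption: the file holds plain shell variable assignments.
        . "$file"
        [ "${run_watchdog:-0}" = 1 ] || { echo "run_watchdog is not 1"; exit 1; }
        [ -n "${watchdog_module:-}" ] || { echo "watchdog_module is empty"; exit 1; }
        echo "daemon enabled, module: $watchdog_module"
    )
}
```

The subshell keeps the sourced variables from leaking into the calling shell.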
+ | |||
+ | ===== Watchdog daemon configuration files ===== | ||
+ | |||
+ | Edit **/etc/watchdog.conf** and uncomment the **watchdog-device** line so that the daemon actually uses the **/dev/watchdog** device provided by the module. Otherwise the daemon does not use the hardware watchdog and relies only on its internal code to soft-reboot a broken machine. | ||
+ | |||
+ | <code> | ||
+ | $ cat /etc/watchdog.conf | ||
+ | #ping = 172.31.14.1 | ||
+ | #ping = 172.26.1.255 | ||
+ | #interface = eth0 | ||
+ | #file = /var/log/messages | ||
+ | #change = 1407 | ||
+ | |||
+ | # Uncomment to enable test. Setting one of these values to '0' disables it. | ||
+ | # These values will hopefully never reboot your machine during normal use | ||
+ | # (if your machine is really hung, the loadavg will go much higher than 25) | ||
+ | #max-load-1 = 24 | ||
+ | #max-load-5 = 18 | ||
+ | #max-load-15 = 12 | ||
+ | |||
+ | # Note that this is the number of pages! | ||
+ | # To get the real size, check how large the pagesize is on your machine. | ||
+ | #min-memory = 1 | ||
+ | |||
+ | #repair-binary = /usr/sbin/repair | ||
+ | #repair-timeout = | ||
+ | #test-binary = | ||
+ | #test-timeout = | ||
+ | |||
+ | watchdog-device = /dev/watchdog | ||
+ | |||
+ | # Defaults compiled into the binary | ||
+ | #temperature-device = | ||
+ | #max-temperature = 120 | ||
+ | |||
+ | # Defaults compiled into the binary | ||
+ | admin = root | ||
+ | interval = 1 | ||
+ | logtick = 1 | ||
+ | log-dir = /var/log/watchdog | ||
+ | |||
+ | # This greatly decreases the chance that watchdog won't be scheduled before | ||
+ | # your machine is really loaded | ||
+ | realtime = yes | ||
+ | priority = 1 | ||
+ | |||
+ | # Check if rsyslogd is still running by enabling the following line | ||
+ | pidfile = /var/run/rsyslogd.pid | ||
+ | |||
+ | watchdog-timeout = 15 | ||
+ | </code> | ||
+ | |||
+ | For more configuration options, see the link below. | ||
+ | [[http://www.sat.dundee.ac.uk/psc/watchdog/watchdog-configure.html]] | ||
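One relationship in the configuration above is worth checking: the daemon writes to the device every `interval` seconds, while the hardware timer is set to `watchdog-timeout` seconds, so the interval has to be smaller than the timeout. The following sketch reads both values from a configuration file and compares them. The function names `conf_value` and `check_watchdog_timing` are hypothetical, and the parser only handles the plain `key = value` lines shown above.

```shell
# Hypothetical helper: print the value of the first uncommented
# "key = value" line in a file.  $1 = file, $2 = key
conf_value() {
    awk -F'=' -v key="$2" '
        /^[ \t]*#/ { next }                     # skip commented-out lines
        {
            k = $1; v = $2
            gsub(/[ \t]/, "", k)                # strip blanks around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)      # trim the value
            if (k == key) { print v; exit }
        }' "$1"
}

# Hypothetical helper: verify that interval < watchdog-timeout.
check_watchdog_timing() {
    conf=${1:-/etc/watchdog.conf}
    interval=$(conf_value "$conf" interval)
    timeout=$(conf_value "$conf" watchdog-timeout)
    [ -n "$interval" ] && [ -n "$timeout" ] || {
        echo "interval or watchdog-timeout not set"
        return 1
    }
    if [ "$interval" -lt "$timeout" ]; then
        echo "ok: interval $interval < timeout $timeout"
    else
        echo "bad: interval $interval >= timeout $timeout"
        return 1
    fi
}
```

With the values shown above (`interval = 1`, `watchdog-timeout = 15`) the check passes with a wide margin.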
+ | ===== Start Watchdog Service and Verify ===== | ||
+ | |||
+ | To start the service at boot, add the following line to /etc/rc.local (before the final ''exit 0'', if the file has one): | ||
+ | <code> | ||
+ | service watchdog restart | ||
+ | </code> | ||
+ | |||
+ | <code> | ||
+ | root@odroid64:~# | ||
+ | odroid@odroid64:~$ service watchdog status | ||
+ | ● watchdog.service - watchdog daemon | ||
+ | Loaded: loaded (/lib/systemd/system/watchdog.service; static; vendor preset: | ||
+ | Active: active (running) since Wed 2016-06-22 01:52:23 EDT; 10s ago | ||
+ | Process: 1384 ExecStopPost=/bin/sh -c [ $run_wd_keepalive != 1 ] || false (cod | ||
+ | Process: 1959 ExecStart=/bin/sh -c [ $run_watchdog != 1 ] || exec /usr/sbin/wa | ||
+ | Process: 1955 ExecStartPre=/bin/sh -c [ -z "${watchdog_module}" ] || [ "${watc | ||
+ | Main PID: 1961 (watchdog) | ||
+ | CGroup: /system.slice/watchdog.service | ||
+ | └─1961 /usr/sbin/watchdog -s -v -c /etc/watchdog.conf | ||
+ | |||
+ | Jun 22 01:52:30 odroid64 watchdog[1961]: still alive after 6 interval(s) | ||
+ | Jun 22 01:52:30 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/ | ||
+ | Jun 22 01:52:31 odroid64 watchdog[1961]: still alive after 7 interval(s) | ||
+ | Jun 22 01:52:31 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/ | ||
+ | Jun 22 01:52:32 odroid64 watchdog[1961]: still alive after 8 interval(s) | ||
+ | Jun 22 01:52:32 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/ | ||
+ | Jun 22 01:52:33 odroid64 watchdog[1961]: still alive after 9 interval(s) | ||
+ | Jun 22 01:52:33 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/ | ||
+ | Jun 22 01:52:34 odroid64 watchdog[1961]: still alive after 10 interval(s) | ||
+ | Jun 22 01:52:34 odroid64 watchdog[1961]: was able to ping process 483 (/var/run/ | ||
+ | |||
+ | </code> | ||
+ | |||
+ | Once the watchdog daemon is configured and running, it continuously resets the watchdog timer. | ||
+ | |||
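+ | Another way to verify that the watchdog device works under the watchdog daemon is to kill the daemon. With nothing left to reset the timer, the board should reboot once the timeout expires. | ||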
+ | <code> | ||
+ | root@odroid64:~# | ||
+ | root@odroid64:~# pkill -9 watchdog | ||
+ | </code> | ||
+ |