Linux Watchdog Driver: The Complete Beginner’s Guide (2024)

On: May 3, 2026

Learn how the Linux watchdog driver works, how to write one, configure /dev/watchdog, use ioctl commands, and set up the watchdog daemon. Code included.

1. What Is a Watchdog Driver in Linux?

Imagine you have a robot that is supposed to check in with you every 30 seconds. If it stops checking in, you assume something went wrong and you reset it. That is exactly what a watchdog timer does for your Linux system.

A Linux watchdog driver is a kernel-level component that monitors system health. It works by requiring your application or a daemon to send a periodic signal — commonly called a “ping” or “keepalive” — to the watchdog device. If that signal stops arriving within a configured timeout window, the watchdog assumes the system has frozen or hung, and it automatically triggers a reboot.

This is not just a nice-to-have feature for servers. In embedded Linux systems — think routers, industrial controllers, automotive ECUs, medical devices — a watchdog timer is often the only safety net between a production system and a permanent hang that nobody can fix remotely.

The Linux kernel has had watchdog support since very early days, but the unified Linux watchdog framework (the watchdog timer driver core), merged around kernel version 3.1, cleaned things up significantly and gave driver developers a proper, consistent API to work with.

So when people talk about the “watchdog driver in Linux,” they typically mean one of two things:

  • The hardware watchdog driver: a kernel module that talks to actual hardware circuitry on your board, which is independent of the CPU. Even if the CPU hangs completely, the hardware watchdog can fire and reset the system.
  • The software watchdog driver (usually softdog): a kernel module that uses Linux kernel timers to simulate watchdog behavior, without requiring dedicated hardware.

Both of these expose themselves to user space through the /dev/watchdog device file, and both follow the same watchdog driver API defined by the Linux kernel watchdog framework.


2. How Does the Watchdog Timer Work in Linux?

Let’s trace the full lifecycle so you really understand the mechanics, not just the buzzwords.

Step 1 — Open the device.
When your application (or the watchdog daemon) opens /dev/watchdog, the watchdog timer starts. This is the trigger point. From this moment on, the countdown begins.

Step 2 — Keepalive pings keep it alive.
Your application must periodically write something to /dev/watchdog — even a single byte — or send an ioctl call with WDIOC_KEEPALIVE. This resets the countdown timer back to its full timeout value. As long as these pings keep coming, nothing bad happens.

Step 3 — Missing a ping triggers a reset.
If the system hangs, goes into an infinite loop, panics in kernel space, or your application crashes and nobody else is pinging — the watchdog timer reaches zero and kicks in. For a hardware watchdog, this means the hardware directly asserts a reset line on the SoC. For the software watchdog (softdog), it calls the kernel’s emergency restart function.

Step 4 — Close behavior with the “magic close” feature.
One gotcha that trips up beginners: you cannot just close /dev/watchdog to safely stop the timer in all cases. The Linux watchdog driver API includes a concept called “magic close.” If you write the character 'V' to /dev/watchdog immediately before closing it, the driver understands that you are stopping the watchdog on purpose and disarms the timer. If you close the device without the magic character, a driver that supports magic close leaves the timer running, so the system resets once the timeout expires unless something reopens the device and resumes pinging. This is intentional: it prevents a crashed feeder process from silently leaving the system unguarded. Drivers that do not advertise magic close simply stop the timer on close, unless nowayout is set.

Here is a minimal C example of how this works at the user-space level:

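#include <stdio.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Cleared by the signal handler so the loop can end and reach the magic close */
static volatile sig_atomic_t keep_running = 1;

static void handle_signal(int sig)
{
    (void)sig;
    keep_running = 0;
}

int main(void)
{
    int fd = open("/dev/watchdog", O_WRONLY);
    if (fd == -1) {
        perror("open /dev/watchdog");
        return 1;
    }

    /* Leave the loop on Ctrl-C or SIGTERM instead of dying with the timer armed */
    signal(SIGINT, handle_signal);
    signal(SIGTERM, handle_signal);

    printf("Watchdog started. Pinging every 10 seconds...\n");

    while (keep_running) {
        /* This write resets the watchdog timer: the "ping" or keepalive */
        if (write(fd, "1", 1) != 1)
            perror("write /dev/watchdog");
        sleep(10);   /* must be shorter than the configured timeout */
    }

    /* Safely stop the watchdog: send the magic close character, then close */
    if (write(fd, "V", 1) != 1)
        perror("magic close");
    close(fd);
    return 0;
}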

This is the most fundamental Linux watchdog example. Your daemon does this in a loop, and if the system hangs, the loop stops, the ping stops arriving, and the watchdog reboots the system automatically.


3. Hardware Watchdog vs Software Watchdog in Linux

This is one of the most common questions people have when they first get into Linux watchdog timers, so let’s clear it up properly.

Hardware Watchdog Linux

A hardware watchdog is a dedicated circuit built into your SoC (System-on-Chip) or available as a separate IC on your PCB. It counts down independently of the code running on the main CPU. The watchdog hardware has its own timer counter, often its own clock source, and direct electrical control over the system reset line.

Why does this matter? Because even if your CPU completely locks up — kernel panic, dead interrupt handlers, whatever — the hardware watchdog circuit is still ticking. It does not care what your CPU is doing. When its counter reaches zero, it asserts the reset signal and brings the system back to life.

Practically every embedded SoC you will encounter — TI AM335x, NXP i.MX6, Allwinner, Raspberry Pi’s BCM2835, Xilinx Zynq — has a hardware watchdog built in. The Linux kernel includes specific drivers for most of them (under drivers/watchdog/ in the kernel source tree).

Software Watchdog Linux

The software watchdog, implemented by the softdog kernel module, does not rely on external hardware. It uses the kernel’s internal timer subsystem to simulate watchdog behavior.

The critical limitation: if the kernel itself freezes — say, a scheduling bug prevents timer interrupts from firing, or you hit a really bad deadlock — the software watchdog cannot save you. It is software running on the same CPU that just died.

That said, softdog is incredibly useful for:

  • Development machines where you have no hardware watchdog
  • Virtual machines (VMs), where the hypervisor may not expose a watchdog device to the guest
  • Quick testing of watchdog behavior without needing specific hardware

Which one should you use?

If you are building a production embedded Linux system for anything critical (industrial, automotive, medical, networking infrastructure), you want a hardware watchdog. Always. The software watchdog is good for development and testing, but it is not a substitute for real hardware-level protection.

Feature                        Hardware Watchdog      Software Watchdog (softdog)
Survives kernel freeze         Yes                    No
Requires specific hardware     Yes                    No
Works in VMs                   Sometimes (paravirt)   Yes
Production embedded systems    Strongly recommended   Not recommended
Available as /dev/watchdog     Yes                    Yes
Uses same API                  Yes                    Yes

4. The Linux Watchdog Framework (Kernel Architecture)

Before the watchdog core was merged (around kernel 3.1), every watchdog driver in the Linux kernel basically did its own thing. Each driver had its own implementation of the file_operations struct, its own ioctl handling, its own timeout management. This led to inconsistencies, bugs, and a lot of duplicated code.

The Linux watchdog framework (introduced by Wim Van Sebroeck and others) standardized all of this. Now, writing a Linux watchdog driver is much cleaner. Here is how the architecture looks:

 User Space Application
        |
        | open/write/ioctl
        v
   /dev/watchdog  (character device, major 10, minor 130)
        |
        v
   watchdog_dev.c  (watchdog core — handles file_operations)
        |
        v
   watchdog_ops   (your driver implements these)
        |
        v
   Hardware / softdog timer

The core lives in drivers/watchdog/watchdog_core.c and drivers/watchdog/watchdog_dev.c. These files handle everything generic — the device registration, the character device operations, the ping tracking, the ioctl dispatch. Your driver just needs to fill in a watchdog_ops struct and a watchdog_device struct.

The key data structures you need to know are:

struct watchdog_device — describes your watchdog device to the framework:

struct watchdog_device {
    int id;
    struct device *parent;
    const struct watchdog_info *info;
    const struct watchdog_ops *ops;
    unsigned int timeout;       /* current timeout in seconds */
    unsigned int min_timeout;   /* minimum allowed timeout */
    unsigned int max_timeout;   /* maximum allowed timeout */
    unsigned long status;       /* status flags */
    /* ... more fields */
};

struct watchdog_ops — the operations your driver must implement:

struct watchdog_ops {
    struct module *owner;
    /* mandatory */
    int (*start)(struct watchdog_device *);
    /* needed only if the hardware can be stopped once started */
    int (*stop)(struct watchdog_device *);
    /* optional but highly recommended */
    int (*ping)(struct watchdog_device *);
    unsigned int (*status)(struct watchdog_device *);
    int (*set_timeout)(struct watchdog_device *, unsigned int);
    long (*ioctl)(struct watchdog_device *, unsigned int, unsigned long);
};

struct watchdog_info — information exposed to user space via the WDIOC_GETSUPPORT ioctl:

struct watchdog_info {
    __u32 options;          /* what the watchdog can do */
    __u32 firmware_version;
    __u8  identity[32];     /* name of the watchdog */
};

This clean framework is what makes the modern Linux watchdog driver API elegant. You implement maybe five functions, fill two structs, call watchdog_register_device(), and the framework handles the rest.


5. What Is /dev/watchdog in Linux?

/dev/watchdog is the userspace interface to the Linux watchdog subsystem. It is a character device with major number 10 and minor number 130. There is also /dev/watchdog0, /dev/watchdog1, etc., for systems with multiple watchdog devices.

When you do ls -la /dev/watchdog on a system with a watchdog enabled, you get something like:

crw------- 1 root root 10, 130 Jan 10 08:22 /dev/watchdog

Only root can access it by default, which makes sense — you do not want random processes accidentally (or intentionally) triggering a system reboot.

Opening /dev/watchdog starts the timer. That is worth saying again because it is a source of confusion. The moment any process opens this device, the watchdog starts counting down. If you open it just to check something and then forget to ping it, your system will reboot. Respect the device.

Only one process can open it at a time. The watchdog device is exclusive. If a second process tries to open it while it is already open, that open() call fails with EBUSY. This prevents race conditions where two processes both think they are managing the watchdog.

The device supports a standard set of operations:

  • open(): starts the watchdog
  • write(): sends a keepalive ping (resets the timer)
  • ioctl(): queries or sets the timeout, reads status and boot status, sends a keepalive, etc.
  • close(): stops the watchdog safely if preceded by writing 'V' (magic close) and nowayout is not set

Here is a quick example of querying the watchdog timeout using ioctl:

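#include <linux/watchdog.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/watchdog", O_RDWR);
    int timeout;

    if (fd == -1) {
        perror("open /dev/watchdog");
        return 1;
    }

    if (ioctl(fd, WDIOC_GETTIMEOUT, &timeout) == 0)
        printf("Current watchdog timeout: %d seconds\n", timeout);
    else
        perror("WDIOC_GETTIMEOUT");

    /* Set a new timeout; the driver writes back the value it actually applied */
    timeout = 30;
    if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout) == 0)
        printf("New watchdog timeout: %d seconds\n", timeout);
    else
        perror("WDIOC_SETTIMEOUT");

    /* Send magic close and stop */
    if (write(fd, "V", 1) != 1)
        perror("magic close");
    close(fd);
    return 0;
}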

6. The Linux Watchdog Driver API Explained

The watchdog driver API is split across two headers: include/linux/watchdog.h holds the kernel interfaces you use when writing a driver, and include/uapi/linux/watchdog.h (installed for applications as <linux/watchdog.h>) holds the ioctl definitions you use from user space.

Let’s look at both sides.

Kernel-Side API (Writing a Driver)

When you write a Linux watchdog driver, you interact with these key kernel functions:

/* Register your watchdog with the framework */
int watchdog_register_device(struct watchdog_device *wdd);

/* Unregister (call in your module's exit function) */
void watchdog_unregister_device(struct watchdog_device *wdd);

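/* True if user space has the device open and the watchdog is active */
static inline bool watchdog_active(struct watchdog_device *wdd);

/* True if the hardware is already running (e.g. started by the bootloader) */
static inline bool watchdog_hw_running(struct watchdog_device *wdd);

/* Mark the watchdog as unstoppable once started (nowayout) */
static inline void watchdog_set_nowayout(struct watchdog_device *wdd, bool nowayout);

/* Initial status value for a statically initialized watchdog_device
   when nowayout is the build-time default */
#define WATCHDOG_NOWAYOUT_INIT_STATUS  (WATCHDOG_NOWAYOUT << WDOG_NO_WAY_OUT)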

User-Space API (Using /dev/watchdog)

From user space, you interact via the standard ioctl() system call. The commands are defined in linux/watchdog.h:

/* Get support information (struct watchdog_info) */
#define WDIOC_GETSUPPORT   _IOR(WATCHDOG_IOCTL_BASE, 0, struct watchdog_info)

/* Get watchdog status */
#define WDIOC_GETSTATUS    _IOR(WATCHDOG_IOCTL_BASE, 1, int)

/* Get boot status (was last reset caused by watchdog?) */
#define WDIOC_GETBOOTSTATUS _IOR(WATCHDOG_IOCTL_BASE, 2, int)

/* Enable or disable the watchdog (WDIOS_ENABLECARD / WDIOS_DISABLECARD) */
#define WDIOC_SETOPTIONS   _IOR(WATCHDOG_IOCTL_BASE, 4, int)

/* Send a keepalive ping */
#define WDIOC_KEEPALIVE    _IOR(WATCHDOG_IOCTL_BASE, 5, int)

/* Set the timeout (in seconds) */
#define WDIOC_SETTIMEOUT   _IOWR(WATCHDOG_IOCTL_BASE, 6, int)

/* Get the current timeout */
#define WDIOC_GETTIMEOUT   _IOR(WATCHDOG_IOCTL_BASE, 7, int)

/* Get time remaining before reset */
#define WDIOC_GETTIMELEFT  _IOR(WATCHDOG_IOCTL_BASE, 10, int)

The WATCHDOG_IOCTL_BASE is defined as 'W' (0x57), so WDIOC_KEEPALIVE expands to _IOR('W', 5, int).

Watchdog Options Flags

The watchdog_info.options field tells you what capabilities your watchdog hardware/driver supports. Common flags include:

#define WDIOF_SETTIMEOUT    0x0080  /* Can set timeout */
#define WDIOF_MAGICCLOSE    0x0100  /* Supports magic close char */
#define WDIOF_KEEPALIVEPING 0x8000  /* Keepalive ping reply */
#define WDIOF_CARDRESET     0x0020  /* Card previously reset the CPU */

7. Watchdog ioctl Commands — WDIOC_KEEPALIVE, SETTIMEOUT, and More

Let’s dig into the most important ioctl commands with real code examples.

WDIOC_KEEPALIVE — Send a Ping

This is the most important ioctl. Instead of writing to the device, you can send a keepalive using ioctl:

int fd = open("/dev/watchdog", O_RDWR);

/* Send keepalive via ioctl */
if (ioctl(fd, WDIOC_KEEPALIVE, 0) < 0) {
    perror("WDIOC_KEEPALIVE failed");
}

Note: writing any byte to /dev/watchdog is equivalent and often simpler. Most real implementations just do write(fd, "\0", 1) in a loop.

WDIOC_SETTIMEOUT and WDIOC_GETTIMEOUT — Watchdog Timeout Linux

int timeout = 60;  /* 60 seconds */

/* Set the timeout */
if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout) < 0) {
    perror("WDIOC_SETTIMEOUT failed");
}

/* The driver may round to the nearest supported value,
   so always read back what was actually set */
if (ioctl(fd, WDIOC_GETTIMEOUT, &timeout) < 0) {
    perror("WDIOC_GETTIMEOUT failed");
}
printf("Watchdog timeout set to: %d seconds\n", timeout);

Important: not all hardware watchdogs support arbitrary timeout values. Some only support specific values (e.g., powers of two based on clock dividers). The driver rounds your requested timeout to the nearest supported value. Always read back the actual timeout after setting it.

WDIOC_GETBOOTSTATUS — Did the Watchdog Cause the Last Reboot?

This is incredibly useful for diagnosing production issues:

int boot_status;
if (ioctl(fd, WDIOC_GETBOOTSTATUS, &boot_status) < 0) {
    perror("WDIOC_GETBOOTSTATUS failed");
}

if (boot_status & WDIOF_CARDRESET) {
    printf("System was reset by the watchdog!\n");
    /* Log this, alert your monitoring system, etc. */
} else {
    printf("Normal boot (not watchdog-triggered)\n");
}

WDIOC_GETSUPPORT — Query Driver Capabilities

struct watchdog_info ident;
if (ioctl(fd, WDIOC_GETSUPPORT, &ident) < 0) {
    perror("WDIOC_GETSUPPORT failed");
}

printf("Watchdog identity: %s\n", ident.identity);
printf("Firmware version: %u\n", ident.firmware_version);
printf("Options: 0x%08x\n", ident.options);

if (ident.options & WDIOF_SETTIMEOUT)
    printf("  - Can set timeout\n");
if (ident.options & WDIOF_MAGICCLOSE)
    printf("  - Supports magic close\n");
if (ident.options & WDIOF_CARDRESET)
    printf("  - Can detect watchdog reset\n");

WDIOC_GETTIMELEFT — Time Until Reboot

int timeleft;
if (ioctl(fd, WDIOC_GETTIMELEFT, &timeleft) < 0) {
    perror("WDIOC_GETTIMELEFT failed");
}
printf("Time left before watchdog fires: %d seconds\n", timeleft);

8. How to Write a Watchdog Driver in Linux (Step-by-Step with Code)

This is the part where things get really interesting. Let’s write a complete, minimal Linux watchdog driver that uses the modern watchdog framework. This is a great tutorial starting point.

We will write a simple platform driver that simulates hardware watchdog behavior using a kernel timer (essentially how softdog works, but structured properly with the watchdog framework API).

Step 1 — Set Up the Module Skeleton

// File: my_watchdog.c
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/platform_device.h>
#include <linux/watchdog.h>
#include <linux/timer.h>
#include <linux/spinlock.h>
#include <linux/reboot.h>     /* emergency_restart() */

#define DRIVER_NAME     "my_watchdog"
#define WDT_DEFAULT_TIMEOUT  30   /* 30 seconds */
#define WDT_MIN_TIMEOUT      1
#define WDT_MAX_TIMEOUT      120

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("A simple Linux watchdog driver example");

Step 2 — Define Driver Private Data

struct my_wdt_dev {
    struct watchdog_device wdd;
    struct timer_list timer;
    spinlock_t lock;
    bool running;
};

Step 3 — Implement the Watchdog Operations

static int my_wdt_start(struct watchdog_device *wdd)
{
    struct my_wdt_dev *wdev = watchdog_get_drvdata(wdd);

    spin_lock(&wdev->lock);
    wdev->running = true;
    /* In real hardware: write to the hardware register to start the timer */
    /* Here we use a kernel timer to simulate it */
    mod_timer(&wdev->timer,
              jiffies + msecs_to_jiffies(wdd->timeout * 1000));
    spin_unlock(&wdev->lock);

    pr_info("%s: watchdog started (timeout=%u seconds)\n",
            DRIVER_NAME, wdd->timeout);
    return 0;
}

static int my_wdt_stop(struct watchdog_device *wdd)
{
    struct my_wdt_dev *wdev = watchdog_get_drvdata(wdd);

    spin_lock(&wdev->lock);
    wdev->running = false;
    del_timer(&wdev->timer);
    spin_unlock(&wdev->lock);

    pr_info("%s: watchdog stopped\n", DRIVER_NAME);
    return 0;
}

static int my_wdt_ping(struct watchdog_device *wdd)
{
    struct my_wdt_dev *wdev = watchdog_get_drvdata(wdd);

    spin_lock(&wdev->lock);
    if (wdev->running) {
        /* Reset the countdown — this is the "keepalive ping" */
        mod_timer(&wdev->timer,
                  jiffies + msecs_to_jiffies(wdd->timeout * 1000));
    }
    spin_unlock(&wdev->lock);
    return 0;
}

static int my_wdt_set_timeout(struct watchdog_device *wdd,
                               unsigned int timeout)
{
    wdd->timeout = timeout;
    if (watchdog_active(wdd))
        my_wdt_ping(wdd);  /* Restart with new timeout */
    return 0;
}

/* This fires when the watchdog expires (no ping received in time) */
static void my_wdt_timer_callback(struct timer_list *t)
{
    struct my_wdt_dev *wdev = from_timer(wdev, t, timer);

    if (wdev->running) {
        pr_crit("%s: watchdog timer expired! Triggering system reboot.\n",
                DRIVER_NAME);
        emergency_restart();
    }
}

Step 4 — Fill in the Framework Structures

static const struct watchdog_info my_wdt_info = {
    .options = WDIOF_SETTIMEOUT | WDIOF_MAGICCLOSE | WDIOF_KEEPALIVEPING,
    .identity = "My Example Watchdog",
    .firmware_version = 1,
};

static const struct watchdog_ops my_wdt_ops = {
    .owner      = THIS_MODULE,
    .start      = my_wdt_start,
    .stop       = my_wdt_stop,
    .ping       = my_wdt_ping,
    .set_timeout = my_wdt_set_timeout,
};

Step 5 — Probe and Remove Functions

static int my_wdt_probe(struct platform_device *pdev)
{
    struct my_wdt_dev *wdev;
    int ret;

    wdev = devm_kzalloc(&pdev->dev, sizeof(*wdev), GFP_KERNEL);
    if (!wdev)
        return -ENOMEM;

    spin_lock_init(&wdev->lock);
    timer_setup(&wdev->timer, my_wdt_timer_callback, 0);

    /* Set up the watchdog_device structure */
    wdev->wdd.info     = &my_wdt_info;
    wdev->wdd.ops      = &my_wdt_ops;
    wdev->wdd.timeout  = WDT_DEFAULT_TIMEOUT;
    wdev->wdd.min_timeout = WDT_MIN_TIMEOUT;
    wdev->wdd.max_timeout = WDT_MAX_TIMEOUT;
    wdev->wdd.parent   = &pdev->dev;

    watchdog_set_drvdata(&wdev->wdd, wdev);
    watchdog_set_nowayout(&wdev->wdd, false); /* allow stopping */

    /* Register with the watchdog framework */
    ret = watchdog_register_device(&wdev->wdd);
    if (ret) {
        dev_err(&pdev->dev, "Failed to register watchdog: %d\n", ret);
        return ret;
    }

    platform_set_drvdata(pdev, wdev);
    dev_info(&pdev->dev, "My watchdog driver initialized\n");
    return 0;
}

/* Note: newer kernels declare .remove as returning void; adjust the signature there */
static int my_wdt_remove(struct platform_device *pdev)
{
    struct my_wdt_dev *wdev = platform_get_drvdata(pdev);

    watchdog_unregister_device(&wdev->wdd);
    del_timer_sync(&wdev->timer);
    return 0;
}

Step 6 — Module Registration

static struct platform_driver my_wdt_driver = {
    .probe  = my_wdt_probe,
    .remove = my_wdt_remove,
    .driver = {
        .name  = DRIVER_NAME,
        .owner = THIS_MODULE,
    },
};

module_platform_driver(my_wdt_driver);

Step 7 — The Makefile

obj-m += my_watchdog.o

KDIR ?= /lib/modules/$(shell uname -r)/build

all:
	$(MAKE) -C $(KDIR) M=$(PWD) modules

clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean

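Build with make and insert the module with insmod my_watchdog.ko. Because this is a platform driver, probe() only runs once a platform device named my_watchdog is registered, for example with platform_device_register_simple() from board or test code (matching a device-tree node would additionally need an of_match_table). After a successful probe, a new /dev/watchdogN node should appear, plus /dev/watchdog if this is the first watchdog on the system.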


9. How to Enable Watchdog in Linux Kernel (Step-by-Step)

If you are building a custom Linux kernel (common in embedded work), you need to make sure watchdog support is compiled in. Here is how to do it step by step.

Step 1 — Open Kernel Configuration

cd /path/to/linux-kernel-source
make menuconfig

Step 2 — Navigate to Watchdog Support

In menuconfig, go to:

Device Drivers --->
  [*] Watchdog Timer Support --->

Step 3 — Enable Relevant Drivers

Under “Watchdog Timer Support” you will see:

<*> Software watchdog (softdog)
< > WDT Watchdog timer
... (platform-specific drivers for your hardware)

For a Raspberry Pi, look for BCM2835 watchdog. For an x86 machine, iTCO watchdog is the Intel TCO watchdog driver. For ARM-based SoCs, look for your specific SoC name.

Enable the one you need as either built-in (<*>) or module (<M>). For embedded systems where you need the watchdog very early in boot, built-in is safer.

Step 4 — Enable CONFIG Options Directly (Alternative)

If you prefer editing .config directly:

# Core watchdog support
CONFIG_WATCHDOG=y

# Software watchdog
CONFIG_SOFT_WATCHDOG=y

# Intel TCO watchdog (x86)
CONFIG_ITCO_WDT=y

# For embedded ARM (example: i.MX watchdog)
CONFIG_IMX2_WDT=y

Step 5 — Check if Watchdog Is Already Running

On a running system, you can check the watchdog status without a custom tool:

# Check if /dev/watchdog exists
ls -la /dev/watchdog*

# Load the softdog module if no hardware watchdog is present
sudo modprobe softdog

# Check kernel messages for watchdog activity
dmesg | grep -i watchdog

# Check if the watchdog daemon is running
systemctl status watchdog

Step 6 — Test Basic Watchdog Functionality

# This starts the watchdog. WARNING: if you do not ping it or stop it properly,
# your system WILL reboot after the timeout.
# One file descriptor is kept open for the whole test, so the device is opened once.
sudo bash -c 'exec 3>/dev/watchdog; echo test >&3; sleep 5; echo V >&3; exec 3>&-'

The final echo V sends the magic close character, safely stopping the watchdog.


10. Linux Watchdog Module Parameters

When loading watchdog modules (either your custom driver or built-in ones like softdog), you can pass parameters to configure their behavior at load time.

softdog Module Parameters

# Set timeout to 60 seconds when loading
sudo modprobe softdog soft_margin=60

# Enable noboot (do not reboot, just log)
sudo modprobe softdog soft_noboot=1

# Nowayout mode: once started, cannot be stopped
sudo modprobe softdog nowayout=1

The key parameters for softdog:

  • soft_margin: the default timeout in seconds (default: 60)
  • soft_noboot: if set to 1, the watchdog logs an expiry but does not reboot (useful for testing)
  • nowayout: prevents the watchdog from being stopped once started, so a stray magic close cannot disarm it

Setting Module Parameters Persistently

To make these persistent across reboots, edit or create a file in /etc/modprobe.d/:

# /etc/modprobe.d/softdog.conf
options softdog soft_margin=60 nowayout=0

Parameters via Kernel Command Line

For built-in watchdog drivers (compiled as y not m), you set parameters via the kernel command line in your bootloader (GRUB, U-Boot, etc.):

# In GRUB config / U-Boot environment
linux /vmlinuz ... softdog.soft_margin=60

Checking Currently Active Parameters

# See what parameters a loaded module supports
modinfo softdog

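# Check current values for a loaded module. These files exist only for
# parameters the driver exports through sysfs; if they are missing, read the
# effective timeout from /sys/class/watchdog/watchdog0/timeout instead
# (available when the kernel has CONFIG_WATCHDOG_SYSFS enabled)
cat /sys/module/softdog/parameters/soft_margin
cat /sys/module/softdog/parameters/nowayout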

11. Linux Watchdog Daemon Configuration with systemd

For production systems, you rarely ping /dev/watchdog directly from your application. Instead, you use a dedicated watchdog daemon (the watchdog program from the watchdog package) that handles the keepalive pings and can also perform additional health checks before deciding whether to keep the system alive.

Installing the Watchdog Daemon

On Debian/Ubuntu:

sudo apt-get install watchdog

On RHEL/Fedora:

sudo dnf install watchdog

The Watchdog Daemon Configuration File

The main config file is /etc/watchdog.conf. Here is a well-documented example covering the most important options:

# /etc/watchdog.conf

# The watchdog device to use
watchdog-device = /dev/watchdog

# Timeout for the watchdog hardware (seconds)
# Must be greater than the interval
watchdog-timeout = 60

# How often the daemon sends a keepalive ping (seconds)
# Should be significantly less than watchdog-timeout
interval = 10

# Maximum load average before the daemon triggers a reboot
max-load-1  = 24
max-load-5  = 18
max-load-15 = 12

# Reboot if available memory drops below this (pages)
min-memory = 1

# Monitor these files: if they are not modified within "change" seconds, reboot
# (useful for application-level health checks)
# file = /var/run/my-app.heartbeat
# change = 180

# Test network connectivity by pinging this host
# If it becomes unreachable, reboot
# ping = 192.168.1.1

# Directory where output from test and repair binaries is logged
log-dir = /var/log/watchdog

# User tests — run this script, reboot if it returns non-zero
# test-binary = /usr/local/bin/system-health-check.sh

# Repair binary — try this first before rebooting
# repair-binary = /usr/local/bin/attempt-recovery.sh

# Temperature monitoring
# temperature-sensor = /dev/temperature-sensor-device
# max-temperature = 90

Enabling the Watchdog Daemon with systemd

# Enable and start the watchdog daemon
sudo systemctl enable watchdog
sudo systemctl start watchdog

# Check its status
sudo systemctl status watchdog

# Check the daemon logs
sudo journalctl -u watchdog -f

Writing a systemd Service That Feeds the Watchdog

Alternatively, for simple embedded use cases, you might write your own minimal watchdog feeder as a systemd service:

# /etc/systemd/system/wdt-feeder.service
[Unit]
Description=Watchdog Keepalive Feeder
After=local-fs.target

[Service]
Type=simple
ExecStart=/usr/local/bin/wdt-feeder
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target

And the wdt-feeder script:

#!/bin/bash
# /usr/local/bin/wdt-feeder

exec 3>/dev/watchdog
trap "echo V >&3; exec 3>&-; exit 0" SIGTERM SIGINT

while true; do
    echo "1" >&3          # Ping the watchdog
    sleep 15 &            # Must be less than the watchdog timeout
    wait $!               # wait is interruptible, so the trap runs promptly
done

The trap on SIGTERM ensures that when systemd stops this service gracefully, it sends the magic close character and does not trigger an unwanted reboot.


12. Watchdog Timeout and What Happens When You Stop Pinging

Let’s be very explicit about this because it is critical and people get it wrong.

Watchdog Timeout Linux — How It Works Exactly

The watchdog timeout (set via WDIOC_SETTIMEOUT or module parameters) is the number of seconds the watchdog will wait for a keepalive ping before triggering a system reset.

If you set watchdog-timeout = 60 in your daemon config and interval = 10, then:

  • Every 10 seconds, the daemon pings the watchdog
  • The watchdog resets its 60-second countdown
  • If the daemon dies and no ping arrives for 60 seconds, the system reboots

What happens during the gap? The kernel (or hardware) is decrementing the timer. For a hardware watchdog, this is happening in dedicated circuitry. For softdog, a kernel timer callback is scheduled, and it calls emergency_restart() when it fires.

The nowayout Flag

The nowayout parameter is a safety feature. When nowayout=1:

  • Once the watchdog is started (i.e., /dev/watchdog is opened), it cannot be stopped
  • Writing 'V' (magic close) has no effect
  • Even closing the file descriptor does not stop it
  • The only way to stop it is a system reboot or a hardware reset

This is ideal for production systems where you absolutely never want the watchdog to be accidentally disabled. A misbehaving process cannot disable the watchdog when nowayout=1.

What If the System Partially Hangs?

This is where the watchdog daemon’s health checks become really valuable. The daemon itself might still be running and pinging /dev/watchdog, but your critical application might have crashed or stopped responding.

This is why /etc/watchdog.conf supports options like:

  • Monitoring file timestamps (file, change)
  • Running test binaries (test-binary)
  • Checking network connectivity (ping)
  • Monitoring load average (max-load-1, max-load-5)

The daemon only pings the watchdog device while all of these checks pass. If a check fails, the daemon can first run a repair script if one is configured. If the problem persists, it initiates a reboot itself, and the hardware watchdog remains as a backstop in case that reboot hangs.


13. Watchdog in Embedded Linux — Why It Matters

If you work on embedded Linux, the watchdog timer is not optional. It is one of those things that experienced embedded engineers treat as fundamental as memory management or interrupt handling.

Here is why embedded systems rely so heavily on hardware watchdog Linux functionality:

No human operator: Your device is deployed somewhere — a factory floor, a remote industrial site, an aircraft, a medical monitor. Nobody is sitting there ready to push the power button if it hangs.

Harsh environments: Embedded systems face electrical noise, power glitches, temperature extremes, and all kinds of real-world conditions that can cause transient software faults, memory corruption, or timing issues that lead to system hangs.

Long uptime requirements: A router or industrial controller might need to run for years without planned downtime. Without a watchdog, a single hang at 3 AM means everything is down until someone physically shows up.

Remote management difficulty: Even with SSH or serial console access, a fully frozen kernel means no remote recovery. The watchdog is your last resort.

Typical Embedded Watchdog Setup

In a typical embedded Linux system:

  1. The bootloader (U-Boot) configures the hardware watchdog early in the boot sequence
  2. U-Boot pings the watchdog while it is loading the kernel
  3. The kernel driver takes over and keeps the watchdog fed during boot
  4. Once user space is up, the watchdog daemon or application takes over the keepalive
  5. If any stage fails to feed the watchdog, the system automatically recovers

This is called a watchdog handoff chain, and getting it right is important. If there is any gap in the chain — say, the kernel takes too long to boot and the watchdog fires before the daemon starts — you get an early reboot loop.


14. Kernel Space vs User Space Watchdog

This is an important distinction that comes up when architecting watchdog-based reliability systems.

User Space Watchdog

Most production setups run the watchdog feeder in user space — either via the watchdog daemon or your own application directly writing to /dev/watchdog. This approach is great because:

  • If the user-space process crashes, the watchdog fires (which is what you want)
  • You can add application-level health checks easily
  • No kernel module modifications needed

But there is a subtle problem: the feeder can stay healthy while the rest of the system is not. If your application is deadlocked in kernel space (stuck in a driver, waiting on an I/O lock, etc.) but the kernel still schedules the watchdog process, the daemon keeps pinging and no reset happens, even though the system is effectively unusable.

Kernel Space Watchdog

In some embedded designs, particularly safety-critical systems, the watchdog is fed from within the kernel itself — either from a kernel thread or from a critical subsystem. This gives stronger guarantees because you can ensure the kernel scheduler itself is still running, not just user space.

The hung task detector and the soft and hard lockup detectors complement the watchdog timer: they catch tasks stuck in uninterruptible sleep and CPUs that stop scheduling or stop handling interrupts, which a user-space feeder may not notice.

In the kernel config:

CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_HARDLOCKUP_DETECTOR=y

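When a soft lockup, hard lockup, or hung task is detected, the kernel logs a warning by default. To turn detection into an automatic reboot, configure the kernel to panic on these events (the kernel.softlockup_panic, kernel.hardlockup_panic, and kernel.hung_task_panic sysctls) and set kernel.panic to a nonzero number of seconds so that a panic leads to a reboot. This gives you kernel-level reboot protection even when the user-space feeder is still pinging.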
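The lockup detector settings are exposed as small text files under /proc/sys/kernel, so a startup check can confirm they are set the way you expect. The helper below is a minimal sketch; read_sysctl_int is a hypothetical name, and which files exist depends on how your kernel was configured.

```c
#include <stdio.h>

/*
 * Reads a single integer from a /proc/sys file.
 * Returns 0 on success and stores the value in *out, or -1 on failure
 * (for example when the kernel was built without that detector).
 */
int read_sysctl_int(const char *path, int *out)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int ok = (fscanf(f, "%d", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

For example, read_sysctl_int("/proc/sys/kernel/watchdog_thresh", &value) retrieves the lockup detector threshold in seconds on kernels that provide it.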


15. Linux Watchdog Configuration File Examples

Here are some real-world watchdog configuration scenarios.

Minimal Embedded System Config

# /etc/watchdog.conf - minimal embedded config
watchdog-device    = /dev/watchdog
watchdog-timeout   = 30
interval           = 10

Server with Load and Memory Monitoring

# /etc/watchdog.conf - server config with health checks
watchdog-device    = /dev/watchdog
watchdog-timeout   = 60
interval           = 15

# Load average thresholds (1-, 5-, and 15-minute averages)
max-load-1         = 24
max-load-5         = 18
max-load-15        = 12

# Minimum free memory (in pages, 1 page = 4KB)
min-memory         = 1

# Ping the gateway to confirm network is alive
ping               = 192.168.1.1
ping-count         = 3

# Check that application is writing its heartbeat file
file               = /var/run/myapp.heartbeat
change             = 60

# Try to recover before rebooting
repair-binary      = /usr/local/bin/try-recovery.sh

log-dir            = /var/log/watchdog

Industrial Embedded with Nowayout

# /etc/watchdog.conf - industrial system, nowayout
watchdog-device    = /dev/watchdog
watchdog-timeout   = 15
interval           = 5

# Test the critical application
test-binary        = /opt/myapp/health-check.sh
test-timeout       = 60

# No repair attempt — just reboot fast

And the modprobe config to match:

# /etc/modprobe.d/watchdog.conf
options imx2_wdt nowayout=1

16. Common Mistakes and How to Debug Your Watchdog Driver

If you are writing or debugging a Linux watchdog driver, here are the pitfalls that bite almost everyone at least once.

Mistake 1 — Forgetting the Magic Close

Your test program opens /dev/watchdog, does something, then exits without writing 'V'. The system reboots 60 seconds later and you think your kernel is unstable. It is not — you just forgot the magic close.

Fix: Always write 'V' before closing the device in test programs. Use the trap pattern in bash scripts. Keep in mind that the magic close has no effect when the driver is loaded with nowayout=1.

Mistake 2 — Setting Interval Greater Than Timeout

If your ping interval is longer than the watchdog timeout, the watchdog fires before you ping it. Your system keeps rebooting every X seconds and you cannot figure out why.

Fix: Make sure interval (in watchdog.conf or your code’s sleep time) is significantly less than the watchdog hardware timeout. A timeout-to-interval ratio of 3:1 or higher is recommended, so that one or two delayed pings do not cause a reset (e.g., ping every 10 seconds with a timeout of 30 to 60 seconds).
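A feeder can refuse to start with a bad interval/timeout combination instead of discovering it through a reboot loop. This is a small sketch of such a check; the function name and the min_ratio parameter are assumptions made for illustration.

```c
#include <stdbool.h>

/*
 * Returns true when the timeout is at least `min_ratio` times the ping
 * interval. Both values are in seconds; non-positive inputs are rejected.
 */
bool interval_is_safe(int interval_s, int timeout_s, int min_ratio)
{
    if (interval_s <= 0 || timeout_s <= 0 || min_ratio <= 0)
        return false;
    return (long long)timeout_s >= (long long)interval_s * min_ratio;
}
```

With a required ratio of 3, pinging every 10 seconds against a 60-second timeout passes, while pinging every 15 seconds against a 30-second timeout does not.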

Mistake 3 — Not Handling SETTIMEOUT Return Value

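The kernel driver may round your requested timeout. If you request 45 seconds but the hardware only supports multiples of 8 seconds, you may get 48. The call can also fail outright if the value is out of range or the driver does not support changing the timeout.

Fix: Check the return value of WDIOC_SETTIMEOUT, then read the effective timeout back with WDIOC_GETTIMEOUT and base your ping interval on that value.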
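The set-then-read-back sequence can be wrapped in one helper so callers never use the requested value by accident. This is a minimal sketch that assumes fd is an already open watchdog descriptor; set_timeout_checked is a hypothetical name.

```c
#include <linux/watchdog.h>
#include <sys/ioctl.h>

/*
 * Requests a timeout and reports what the driver actually programmed.
 * Returns 0 on success, -1 if either ioctl fails (errno is set).
 * *actual_s is only written on success.
 */
int set_timeout_checked(int fd, int requested_s, int *actual_s)
{
    int t = requested_s;

    if (ioctl(fd, WDIOC_SETTIMEOUT, &t) < 0)
        return -1;

    /* Read back explicitly rather than trusting the requested value */
    if (ioctl(fd, WDIOC_GETTIMEOUT, &t) < 0)
        return -1;

    *actual_s = t;
    return 0;
}
```

A caller would then derive its sleep time from the returned value, for example one third of it, rather than from the number it asked for.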

Mistake 4 — Race Condition in Driver’s start/stop

If your driver uses spinlocks or mutexes incorrectly in the start() and ping() operations, you can get a race between the timer callback and the ping — leading to either a missed reset or a spurious reset.

Fix: Use consistent locking in all paths that touch the timer.

Debugging Tips

# Check kernel messages for watchdog events
dmesg | grep -i "watchdog\|wdt"

# Check if the device is being used
lsof /dev/watchdog

# Check what the watchdog daemon is doing
sudo journalctl -u watchdog --no-pager -n 50

# Check loaded watchdog kernel modules
lsmod | grep -i "wdt\|watchdog"

# Check the kernel lockup detector settings (separate from /dev/watchdog)
cat /proc/sys/kernel/watchdog
cat /proc/sys/kernel/watchdog_thresh

17. FAQ — People Also Ask

What is watchdog driver in Linux?

A Linux watchdog driver is a kernel module that manages a watchdog timer — either hardware-based or software-based. It exposes the /dev/watchdog device to user space. Applications or the watchdog daemon periodically write to this device (“ping” it) to prevent the system from rebooting. If the ping stops coming within the configured timeout, the watchdog triggers an automatic system reset. It is a fundamental reliability mechanism used in servers, embedded systems, and anywhere Linux needs to self-recover from hangs.

How does watchdog timer work in Linux?

When you open /dev/watchdog, the countdown timer starts. Your application (or the watchdog daemon) must write to the device before the timer reaches zero. Each write resets the countdown. If the system hangs, the ping stops, the timer expires, and the watchdog triggers a reboot. For hardware watchdogs, this reset happens at the hardware level, independent of the CPU state. For software watchdogs (softdog), a kernel timer calls emergency_restart().

What is /dev/watchdog in Linux?

/dev/watchdog is a character device file (major 10, minor 130) that provides user-space access to the Linux watchdog subsystem. Opening it starts the watchdog timer. Writing to it sends a keepalive ping. Using ioctl() on it lets you get/set the timeout, query driver capabilities, check boot status, and more. Only one process can open it at a time. To safely stop the watchdog, write 'V' (the magic close character) before closing the file descriptor.

How to enable watchdog in Linux kernel?

Run make menuconfig in your kernel source directory, navigate to Device Drivers -> Watchdog Timer Support, and enable the relevant driver for your hardware (or softdog for software watchdog). Set it as built-in (y) for embedded systems. On a running system, load the module with sudo modprobe softdog (or your hardware-specific module name), and verify /dev/watchdog appears.

Why watchdog is used in embedded Linux?

Embedded Linux devices are often deployed without human operators nearby, need to run for months or years without planned maintenance, and operate in harsh environments that can cause transient faults. A hardware watchdog timer provides an automatic recovery mechanism — if the system hangs for any reason, the watchdog fires and reboots it, restoring normal operation without any human intervention. This is critical for routers, industrial controllers, medical devices, and automotive systems.

How to reset system using watchdog Linux?

The watchdog automatically resets the system when no keepalive ping is received within the timeout period. The standard watchdog API has no “force reset now” command: WDIOC_SETOPTIONS only accepts WDIOS_DISABLECARD, WDIOS_ENABLECARD, and WDIOS_TEMPPANIC. To trigger a reset on purpose, open the device, optionally shorten the timeout with WDIOC_SETTIMEOUT, then stop pinging and wait for the timeout to expire. Make sure you really want a reboot first.

What is watchdog timeout in Linux?

The watchdog timeout is the number of seconds the watchdog timer will count down without a keepalive ping before triggering a system reset. It is configurable via the WDIOC_SETTIMEOUT ioctl command or via module parameters. After setting it, always read back the actual value with WDIOC_GETTIMEOUT because hardware watchdogs often round to the nearest supported value. Common timeout values range from 10 seconds to several minutes depending on the use case.

What happens if watchdog is not pinged?

If the watchdog timer reaches zero without receiving a keepalive ping, it triggers a system reset. For a hardware watchdog, this is a hardware-level reset that happens regardless of what the CPU is doing, much like pressing a reset button. For the software watchdog (softdog), the kernel calls emergency_restart(). After the reboot, you can check whether the watchdog caused the reset using the WDIOC_GETBOOTSTATUS ioctl, which reports WDIOF_CARDRESET if the last boot was watchdog-triggered and the driver supports boot status reporting.

How to write watchdog driver in Linux?

To write a Linux watchdog driver: include linux/watchdog.h, then fill in a watchdog_ops struct. Only start() is mandatory in current kernels, but most drivers also implement stop(), ping(), and set_timeout(). Next, fill in a watchdog_device struct with your watchdog_ops, min/max timeout, and driver info. Finally, call watchdog_register_device() in your probe function and watchdog_unregister_device() in your remove function. The watchdog framework handles all the character device boilerplate for you. See the full example code in Section 8 of this guide.


18. Real-World Watchdog Scenarios and Patterns

Before wrapping up, let’s look at a few real patterns that experienced engineers use in production systems. These go slightly beyond the basics but are important to know.

Pattern 1 — Two-Stage Watchdog Timeout

Some systems use a two-stage timeout approach. The first stage gives the system time to attempt recovery. The second stage triggers the hard reboot. You can simulate this with the repair-binary option in watchdog.conf:

# Stage 1: attempt repair (allowed to run for up to 90 seconds)
repair-binary = /usr/local/bin/attempt-recovery.sh
repair-timeout = 90
# Stage 2: reboot if repair fails (overall watchdog timeout)
watchdog-timeout = 120
interval = 10

The repair script runs first. If it exits with 0, the daemon continues pinging and no reboot happens. If the script fails or times out, the daemon stops pinging, and the hardware watchdog fires.

Pattern 2 — Application Heartbeat Through watchdog.conf

Instead of your application writing to /dev/watchdog directly (which requires root and exclusive access), have it write a timestamp to a file. The watchdog daemon monitors that file and only pings the hardware watchdog if the file is being updated:

# /etc/watchdog.conf
watchdog-device  = /dev/watchdog
watchdog-timeout = 60
interval         = 10
file             = /var/run/myapp.heartbeat
# File must change within 30 seconds
change           = 30

Your application just does:

# In your app's main loop (no exclusive device access; no root needed
# as long as the heartbeat file is writable by the application user)
while true; do
    touch /var/run/myapp.heartbeat
    sleep 15
done

This is a much cleaner architecture for complex applications and means you do not need to give your application root privileges just for watchdog access.
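If the application is written in C, it can refresh the heartbeat itself instead of spawning touch. The sketch below assumes the heartbeat path is writable by the application user; touch_heartbeat is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Updates the timestamps of the heartbeat file, creating it if needed.
 * Returns 0 on success, -1 on failure (errno is set).
 */
int touch_heartbeat(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* NULL means "set both timestamps to the current time" */
    int rc = futimens(fd, NULL);
    close(fd);
    return rc;
}
```

Call it from the point in the main loop that proves real work is being done, not from a separate timer thread, so that a stalled main loop also stalls the heartbeat.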

Pattern 3 — Detecting Watchdog Resets in Your Application

When your system reboots due to a watchdog timeout, you almost certainly want to know about it. Here is a startup check pattern:

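#include <fcntl.h>
#include <linux/watchdog.h>
#include <sys/ioctl.h>
#include <syslog.h>

/*
 * Opens /dev/watchdog, logs whether the last boot was caused by a
 * watchdog reset, and returns the open descriptor (or -1 on failure).
 * Opening the device starts the watchdog, so the caller must keep
 * pinging through the returned descriptor.
 */
int check_watchdog_reset(void)
{
    int fd = open("/dev/watchdog", O_RDWR);
    int boot_status = 0;

    if (fd < 0) {
        syslog(LOG_ERR, "cannot open /dev/watchdog: %m");
        return -1;
    }

    /* Not every driver reports boot status; treat failure as "unknown" */
    if (ioctl(fd, WDIOC_GETBOOTSTATUS, &boot_status) < 0) {
        syslog(LOG_WARNING, "WDIOC_GETBOOTSTATUS failed: %m");
    } else if (boot_status & WDIOF_CARDRESET) {
        syslog(LOG_CRIT, "WATCHDOG RESET DETECTED: system recovered from hang");
        /* Send alert to your monitoring system */
        /* Save a crash report */
        /* Increment a persistent reboot counter */
    }

    /* Caller now sets up normal keepalive logic on fd */
    return fd;
}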

Logging watchdog resets to persistent storage (or sending alerts) is essential for understanding your system’s reliability profile over time.
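A persistent reboot counter can be as simple as an integer kept in a text file on storage that survives reboots. The following is a minimal sketch under that assumption; bump_reboot_counter and any path passed to it are hypothetical, and a production version would likely write to a temporary file, sync it, and rename it to survive power loss mid-write.

```c
#include <stdio.h>

/*
 * Increments an integer counter stored as text in `path`.
 * A missing or unreadable file counts as zero.
 * Returns the new value, or -1 if the file could not be written.
 */
long bump_reboot_counter(const char *path)
{
    long count = 0;

    FILE *f = fopen(path, "r");
    if (f) {
        if (fscanf(f, "%ld", &count) != 1 || count < 0)
            count = 0;
        fclose(f);
    }

    count++;

    f = fopen(path, "w");
    if (!f)
        return -1;
    fprintf(f, "%ld\n", count);
    if (fclose(f) != 0)
        return -1;
    return count;
}
```

Calling it only when a watchdog reset was detected at startup gives you a running total of watchdog-triggered reboots for that device.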

Pattern 4 — Watchdog in a Multi-Process System

In a complex system with multiple critical processes, they cannot all open /dev/watchdog, because only one process can hold it open at a time. The standard pattern is:

  1. A single lightweight watchdog manager process opens /dev/watchdog and owns the keepalive loop.
  2. Each critical process writes to a named pipe, shared memory flag, or Unix socket to report it is alive.
  3. The watchdog manager only pings /dev/watchdog if all critical processes have checked in within the expected interval.
  4. If any process dies or stops responding, the manager stops pinging, and the watchdog fires.

This gives you system-level watchdog protection that is aware of application-level health, without every application needing direct watchdog access.
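The decision the manager makes on every loop iteration can be sketched as a single function. How check-ins arrive (pipe, shared memory, socket) is left out here; the struct and function names are illustrative assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Last check-in time recorded for one monitored process. */
struct client {
    const char *name;
    time_t last_seen;
};

/*
 * Returns true only if every client has checked in within max_age_s
 * seconds of `now`. The manager pings /dev/watchdog only when this holds.
 */
bool all_clients_alive(const struct client *clients, size_t count,
                       time_t now, time_t max_age_s)
{
    for (size_t i = 0; i < count; i++) {
        if (now - clients[i].last_seen > max_age_s)
            return false;
    }
    return true;
}
```

The manager's loop then reduces to: collect check-ins, call all_clients_alive(), and write to the watchdog device only when it returns true.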

Working with Multiple Watchdog Devices

Modern systems sometimes expose multiple watchdog timers (for example, a SoC might have both a hardware watchdog and a PMU watchdog). In that case, you get /dev/watchdog0, /dev/watchdog1, etc. The legacy /dev/watchdog node is kept for compatibility and refers to the same timer as /dev/watchdog0, the first watchdog registered.

# Check all available watchdog devices
ls /dev/watchdog*

# In /etc/watchdog.conf, point the daemon at a specific device
watchdog-device = /dev/watchdog1

Each device is independent and managed by its own driver. You would typically only manage one in a given system configuration.
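A program can discover the available device nodes without opening any of them, which matters because opening a watchdog device starts its timer. The sketch below uses POSIX glob() for that; list_watchdog_nodes is a hypothetical helper name.

```c
#include <glob.h>
#include <stdio.h>

/*
 * Prints every path matching `pattern` and returns how many matched.
 * Only names are listed; nothing is opened, so no watchdog is started.
 */
int list_watchdog_nodes(const char *pattern)
{
    glob_t g;

    if (glob(pattern, 0, NULL, &g) != 0)
        return 0;   /* no match or error */

    int count = (int)g.gl_pathc;
    for (size_t i = 0; i < g.gl_pathc; i++)
        printf("%s\n", g.gl_pathv[i]);

    globfree(&g);
    return count;
}
```

Passing "/dev/watchdog[0-9]*" lists the numbered nodes only, leaving out the legacy /dev/watchdog entry.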


19. Conclusion

The Linux watchdog driver is one of those subsystems that is easy to overlook until you actually need it — and then you are really glad it exists. Whether you are working on a high-availability server that needs to self-recover from kernel hangs, or an industrial embedded system deployed in the middle of nowhere, the watchdog timer is your silent guardian.

Here is a quick recap of what we covered:

  • The Linux watchdog framework gives you a clean, standardized API.
  • Hardware watchdogs are always preferred for production embedded systems because they survive kernel freezes.
  • Software watchdog (softdog) is great for development, VMs, and systems without dedicated hardware.
  • The /dev/watchdog device is your interface from user space: open it, ping it, and close it with 'V' if you want to stop it gracefully.
  • The watchdog daemon plus /etc/watchdog.conf is the standard production setup on most Linux distributions.
  • Writing your own driver is straightforward with the watchdog_ops and watchdog_device structures: fill in the operations your hardware supports and register with watchdog_register_device().
  • Module parameters like soft_margin and nowayout let you tune behavior at load time.
  • Systemd integrates cleanly with the watchdog daemon for reliable production deployments.

If you are just getting started, load softdog on your development machine right now (sudo modprobe softdog), play with the example code from Section 2, and get comfortable with how /dev/watchdog behaves. Once you understand the keepalive ping mechanism and the magic close character, the rest falls into place naturally.

For embedded developers, spend time understanding the hardware watchdog specific to your SoC, configure nowayout=1 for production builds, and think carefully about your keepalive chain from bootloader to user space.

The Linux watchdog ecosystem is mature, well-documented in the kernel source under Documentation/watchdog/, and used in millions of production deployments worldwide. It is worth understanding deeply.

For a detailed understanding of platform devices and drivers on Linux, refer to the Linux kernel documentation on Platform Devices and Drivers.
