Home
Nagios is an open-source application for monitoring computer systems, networks, and infrastructure. Originally created under the name NetSaint, it was written and is currently maintained by Ethan Galstad, along with a group of developers who actively maintain both the official and unofficial plugins.
The Nagios Plugins for Linux are intended to be run by NRPE, the Nagios Remote Plugin Executor, which "allows you to remotely execute Nagios plugins on other Linux/Unix machines. This allows you to monitor remote machine metrics (disk usage, CPU load, etc.)."
LNX_CLOCK - returns the number of seconds elapsed between local time and Nagios time
[/etc/nrpe.d/check_clock]
command[check_clock]=/usr/lib/nagios/plugins/check_clock --refclock $ARG1$ -w 60 -c 120
where $ARG1$ is the number of seconds since the "Epoch" (1970-01-01 00:00:00 UTC), for example $(date '+%s'), provided by the Nagios poller.
This check raises an alert when the clock difference, in seconds, between the Nagios poller and the monitored server exceeds a given threshold (60 seconds for the warning state and 120 seconds for the critical state in the example above). For the comparison to be meaningful, the clock of the Nagios server must itself be synchronized to an NTP server.
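The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the plugin's actual C code: the function name `check_clock` and its parameters are hypothetical, and taking the absolute value of the difference is an assumption. The exit codes 0, 1 and 2 are the standard Nagios OK, WARNING and CRITICAL states.

```python
import time


def check_clock(refclock, local_now, warning=60, critical=120):
    """Return (exit_code, message) for a clock-skew check.

    refclock  -- seconds since the Epoch reported by the Nagios poller
    local_now -- seconds since the Epoch on the monitored host
    """
    # Assumption: the delta is treated as an absolute value, so a host
    # running ahead of the poller is handled like one lagging behind.
    delta = abs(int(local_now) - int(refclock))

    if delta > critical:
        state, code = "CRITICAL", 2
    elif delta > warning:
        state, code = "WARNING", 1
    else:
        state, code = "OK", 0

    # Human-readable text first, then "|" and the performance data.
    return code, f"clock {state} - time delta {delta}s | clock_delta={delta}"


if __name__ == "__main__":
    now = int(time.time())
    # Pretend the poller sent a timestamp taken 39 seconds ago.
    print(check_clock(now - 39, now)[1])
```

With a 39-second difference and the thresholds from the NRPE example, the sketch reports the OK state, because 39 is below the warning threshold of 60.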
This plugin returns the number of seconds elapsed between
the host local time and Nagios time.
Copyright (C) 2014 Davide Madrisan <[email protected]>
Usage:
check_clock [-w COUNTER] [-c COUNTER] --refclock TIME
Options:
  -r, --refclock COUNTER   the clock reference (in seconds since the Epoch)
  -w, --warning COUNTER    warning threshold
  -c, --critical COUNTER   critical threshold
  -v, --verbose            show details for command-line debugging
                           (Nagios may truncate output)
  -h, --help               display this help and exit
  -V, --version            output version information and exit
Examples:
check_clock -w 60 -c 120 --refclock $ARG1$
# where $ARG1$ is the number of seconds since the Epoch: "$(date '+%s')"
# provided by the Nagios poller
clock OK - time delta 39s | clock_delta=39
Performance data: clock_delta
LNX_UPTIME - check how long the system has been running
[/etc/nrpe.d/check_uptime]
command[check_uptime]=/usr/lib/nagios/plugins/check_uptime
command[check_uptime_notify]=/usr/lib/nagios/plugins/check_uptime --critical 30:
In the example above, Nagios sends a notification when the uptime of the monitored server is less than 30 minutes. This catches, for instance, an unexpected reboot of a server caused by a non-maskable interrupt (the signal of a non-recoverable hardware error).
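The threshold `30:` uses the range format from the Nagios plugin guidelines, where `N:` means "alert when the value is below N". The following Python sketch shows how such a range could be evaluated. The helper names `parse_range` and `alerts` are hypothetical, and the code does not claim to mirror the plugin's own parser.

```python
import math


def parse_range(spec):
    """Parse a Nagios threshold range into (low, high, inside)."""
    # A leading "@" inverts the test: alert when the value is INSIDE.
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]

    if ":" in spec:
        low_s, high_s = spec.split(":", 1)
    else:
        low_s, high_s = "0", spec      # a bare "N" means the range 0..N

    # "~" stands for negative infinity; an empty end means infinity.
    low = -math.inf if low_s == "~" else float(low_s or 0)
    high = math.inf if high_s == "" else float(high_s)
    return low, high, inside


def alerts(value, spec):
    """Return True when `value` should raise an alert for range `spec`."""
    low, high, inside = parse_range(spec)
    in_range = low <= value <= high
    return in_range if inside else not in_range


if __name__ == "__main__":
    for minutes in (12, 45):
        print(minutes, "min ->", "ALERT" if alerts(minutes, "30:") else "ok")
```

Under this reading, an uptime of 12 minutes triggers `--critical 30:`, while an uptime of 45 minutes does not.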
This new Nagios plugin is based on the POSIX function clock_gettime(), used with the monotonic clock (CLOCK_MONOTONIC).
According to the POSIX specification, "the value returned by clock_gettime() represents the amount of time (in seconds and nanoseconds) since an unspecified point in the past (for example, system start-time, or the Epoch)".
Recent Linux kernels return a value that is related to the system start time but can differ from the output of the uptime command (procps) and from the first value of /proc/uptime:
$ /usr/bin/uptime
18:45:00 up 8:46, 7 users, load average: 0.67, 1.79, 2.49
$ awk '{printf("%02d:%02d\n",($1/60/60%24),($1/60%60))}' /proc/uptime
08:46
$ ./clock_monotonic
4 hours 37 min
(On OpenBSD 5.0, the monotonic clock returns the same value as uptime, which confirms that this behaviour is platform dependent.)
The implementation followed by nagios-plugins-linux is compatible with uptime and /proc/uptime.
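To reproduce the comparison on a given host, a small Python sketch can read both sources side by side. It assumes a Unix system where `time.clock_gettime` and `time.CLOCK_MONOTONIC` are available. The helpers `format_uptime` and `proc_uptime` are hypothetical names, and the formatter prints only hours and minutes with no days field. One likely reason for a gap on Linux is that CLOCK_MONOTONIC does not advance while the system is suspended.

```python
import time


def format_uptime(seconds):
    """Format a duration in seconds as 'H hours M min'."""
    minutes = int(seconds) // 60
    return f"{minutes // 60} hours {minutes % 60} min"


def proc_uptime(path="/proc/uptime"):
    """Return the first field of /proc/uptime (Linux only), in seconds."""
    with open(path) as f:
        return float(f.read().split()[0])


if __name__ == "__main__":
    # Time since an unspecified starting point, per POSIX.
    monotonic = time.clock_gettime(time.CLOCK_MONOTONIC)
    print("CLOCK_MONOTONIC:", format_uptime(monotonic))
    try:
        print("/proc/uptime:   ", format_uptime(proc_uptime()))
    except OSError:
        print("/proc/uptime is not available on this platform")
```

If the two lines differ, the host shows the platform-dependent behaviour described above.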
This plugin checks how long the system has been running.
Copyright (C) 2010,2012-2014 Davide Madrisan <[email protected]>
Usage:
check_uptime [OPTION]
Options:
  -m, --clock-monotonic    use the monotonic clock for retrieving the time
  -w, --warning PERCENT    warning threshold
  -c, --critical PERCENT   critical threshold
  -h, --help               display this help and exit
  -V, --version            output version information and exit
Examples:
check_uptime
check_uptime --critical 15: --warning 30:
check_uptime --clock-monotonic -c 15: -w 30:
See the Nagios Developer Guidelines for range format:
<https://nagios-plugins.org/doc/guidelines.html#THRESHOLDFORMAT>
uptime OK: 23 hours 56 min | uptime=1436
Performance data: uptime (in minutes)