Nagios
Revision as of 08:48, 31 August 2011
| Path / Command | Description |
|---|---|
| /etc/nagios3/conf.d | Config files |
| /etc/nagios-plugins/config | Plugin commands |
| /usr/lib/nagios/plugins | Plugin executables |
| nagios3 -v /etc/nagios3/nagios.cfg | Config check |
| service nagios3 restart | Restart service (reloads config) |
./usr/share/nagios ./usr/lib/nagios ./var/lib/nagios
define service{
use generic-service ; Inherit default values from a template
hostgroup_name zimbra-servers
service_description IMAP
check_command check_imap
}
define service{
use generic-service ; Inherit default values from a template
hostgroup_name zimbra-servers
service_description SMTP
check_command check_smtp
}
- check that MySQL services are up
define service {
hostgroup_name mysql-servers
service_description MySQL
check_command check_mysql
use generic-service
notification_interval 0 ; set > 0 if you want to be renotified
}
define command{
command_name check_http_auth
command_line /usr/lib/nagios/plugins/check_http -H '$HOSTADDRESS$' -I '$HOSTADDRESS$' -a '$ARG1$'
}
define service{
use generic-service ; Name of service template to use
host_name localhost
service_description HTTP
check_command check_http_auth!user:pass ; Enter actual user/pass
}
define hostextinfo{
hostgroup_name debian-servers
notes Debian GNU/Linux servers
icon_image base/debian.png
icon_image_alt Debian GNU/Linux
vrml_image debian.png
statusmap_image base/debian.gd2
}
define hostextinfo{
hostgroup_name ubuntu-servers
notes Ubuntu servers
icon_image base/ubuntu.png
icon_image_alt Ubuntu
vrml_image ubuntu.png
statusmap_image base/ubuntu.gd2
}
Create SNMP Checks
The checks below are all for my QNAP NAS, which I've used as an example.
Define OIDs to Poll
Before you start you need to know which SNMP OIDs you want to poll, and what their values should be. For common devices and metrics you can often get by with a Google search or two, but it doesn't take much before you need to dig a bit deeper.
When it comes to investigating which OIDs you can poll on a specific device, GetIf is your friend.
Having downloaded the MIB and done some probing with GetIf, I decided to monitor the following OIDs...
| OID | Description | Example Return Data |
|---|---|---|
| .1.3.6.1.4.1.24681.1.2.6.0 | System Temperature | 41 C/105 F |
| .1.3.6.1.4.1.24681.1.2.17.1.6.1 | System Volume 1 Status | Ready |
| .1.3.6.1.4.1.24681.1.2.17.1.5.1 | System Volume 1 Space | 1.74 TB |
| .1.3.6.1.4.1.24681.1.2.11.1.7.1 | Physical Disk 1 SMART Status | GOOD |
| .1.3.6.1.4.1.24681.1.2.11.1.3.1 | Physical Disk 1 Temperature | 35 C/95 F |
Create Commands
Each type of check needs a command defined for it. Where several checks will be similar (e.g. the status of disk 1, disk 2, etc.), you can give the command arguments whose values are supplied later, when the individual check is defined. I created a new file called /etc/nagios3/conf.d/commands_qnap.cfg and added the following...
define command{
command_name check_qnap_sys_temp
command_line /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -o .1.3.6.1.4.1.24681.1.2.6.0 -w 45 -c 55 -l Temp -u C
}
- -H '$HOSTADDRESS$' - A standard macro for all check commands; Nagios substitutes the device's IP address
- -o .1.3.6.1.4.1.24681.1.2.6.0 - The SNMP OID being checked
  - .iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemTemperature.0
- -w 45 - The warning threshold
- -c 55 - The critical threshold
- -l Temp - A label for the check (appears in the check's Status Information column in the Nagios display)
- -u C - The units of the metric being checked (appears in the check's Status Information column in the Nagios display)
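A command does nothing until a service references it. The fragment below is a minimal sketch of such a service, not part of the original page. It assumes a host object named qnap-nas (a hypothetical name) is already defined and that the generic-service template used in the earlier examples is available.

```cfg
define service{
use generic-service ; Same template as the earlier examples
host_name qnap-nas ; Hypothetical host name, replace with your own host object
service_description System Temperature
check_command check_qnap_sys_temp ; No arguments, thresholds are fixed in the command
}
```

Because the warning and critical thresholds are hard-coded in the command, this service needs no `!` arguments after the command name.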
define command{
command_name check_qnap_sysvol_status
command_line /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -o .1.3.6.1.4.1.24681.1.2.17.1.6.$ARG1$ -l "Volume Status"
}
- -o .1.3.6.1.4.1.24681.1.2.17.1.6.$ARG1$ - The SNMP OID being checked. $ARG1$ is used as a placeholder so that if I had more than one volume I could repeat the check for volume 1, 2, etc. without creating a separate check command for each.
  - .iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemVolumeTable.SysVolumeEntry.SysVolumeStatus.$ARG1$
define command{
command_name check_qnap_sysvol_space
command_line /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -o .1.3.6.1.4.1.24681.1.2.17.1.5.$ARG1$ -w $ARG2$: -c $ARG3$: -l "Volume Space" -u TB
}
- -o .1.3.6.1.4.1.24681.1.2.17.1.5.$ARG1$ - The SNMP OID being checked. As above, $ARG1$ is used as a placeholder so that if I had more than one volume I could repeat the check for volume 1, 2, etc. without creating a separate check command for each.
  - .iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemVolumeTable.SysVolumeEntry.SysVolumeFreeSize.$ARG1$
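The two volume commands take their arguments from the service definition, separated by `!`. The sketch below shows how that might look for volume 1. The host name qnap-nas and the threshold values 0.5 and 0.2 are illustrative assumptions, not values from the original page. In the Nagios plugin threshold format a trailing colon, as in `-w $ARG2$:`, means the alert fires when the value falls below the number, which suits a free-space metric. It also assumes the NAS keeps reporting the figure in TB, matching the `-u TB` in the command.

```cfg
define service{
use generic-service
host_name qnap-nas ; Hypothetical host name
service_description Volume 1 Status
check_command check_qnap_sysvol_status!1 ; $ARG1$ = 1 (volume index)
}

define service{
use generic-service
host_name qnap-nas ; Hypothetical host name
service_description Volume 1 Free Space
check_command check_qnap_sysvol_space!1!0.5!0.2 ; $ARG1$ = 1, $ARG2$ = 0.5 (warn below), $ARG3$ = 0.2 (critical below); example values only
}
```

A second volume would need only another pair of services with `!2` as the first argument; the command definitions stay unchanged.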
define command{
command_name check_qnap_disk_status
command_line /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -o .1.3.6.1.4.1.24681.1.2.11.1.7.$ARG1$ -l "SMART Info State"
}
- -o .1.3.6.1.4.1.24681.1.2.11.1.7.$ARG1$ - The SNMP OID being checked. As with the volume checks, $ARG1$ is used as a placeholder so that I can create separate checks for the individual disks without creating a separate check command for each.
  - .iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemHdTable.HdEntry.HdSmartInfo.$ARG1$
define command{
command_name check_qnap_disk_temp
command_line /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -o .1.3.6.1.4.1.24681.1.2.11.1.3.$ARG1$ -w 45 -c 55 -l Temp -u C
}
- -o .1.3.6.1.4.1.24681.1.2.11.1.3.$ARG1$ - The SNMP OID being checked. As above, $ARG1$ is used as a placeholder so that I can create separate checks for the individual disks without creating a separate check command for each.
  - .iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemHdTable.HdEntry.HdTemperature.$ARG1$
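With the disk index passed as $ARG1$, per-disk checks come down to one service per disk and metric. The following is a sketch only. The host name qnap-nas is hypothetical, and the disk 2 entry assumes the NAS has a second disk at index 2, which the original page does not confirm.

```cfg
define service{
use generic-service
host_name qnap-nas ; Hypothetical host name
service_description Disk 1 SMART Status
check_command check_qnap_disk_status!1 ; $ARG1$ = 1 (disk index)
}

define service{
use generic-service
host_name qnap-nas ; Hypothetical host name
service_description Disk 1 Temperature
check_command check_qnap_disk_temp!1 ; Thresholds (45/55) are fixed in the command
}

define service{
use generic-service
host_name qnap-nas ; Hypothetical host name
service_description Disk 2 SMART Status
check_command check_qnap_disk_status!2 ; Assumes a second disk exists at index 2
}
```

As written, the status commands only display the returned string. If the check should alert on anything other than GOOD, check_snmp's string or regular-expression matching options (-s and -r) may be worth adding to the command, though that has not been tested here.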