Difference between revisions of "Nagios"

1,700 bytes added, 10:03, 6 September 2012
→‎Create SNMP Checks: Added standard disk status check
(Added "Check Tuning" and Meta)
(→‎Create SNMP Checks: Added standard disk status check)
Line 46: Line 46:
|-
| <code> .1.3.6.1.4.1.24681.1.2.17.1.5.1 </code>  || System Volume 1 Space || <code> 1.74 TB </code>
|-
| <code> .1.3.6.1.4.1.24681.1.2.11.1.4.1 </code>  || Physical Disk 1 Status || <code> ready </code>
|-
| <code> .1.3.6.1.4.1.24681.1.2.11.1.7.1 </code>  || Physical Disk 1 SMART Status || <code> GOOD </code>
Line 58: Line 60:
I created a new file, called <code>/etc/nagios3/conf.d/commands_qnap.cfg</code> and added the following...


==== System Temperature ====
  define command{
         command_name    check_qnap_sys_temp
Line 70: Line 73:
* <code> -u C </code> - The units of the metric being checked (appears in the check's Status Information column in Nagios display)


 
==== Volume Status ====
  define command{
         command_name    check_qnap_sysvol_status
Line 79: Line 82:
* <code> -r "Ready" </code> - The text expected back from the poll, anything else causes a critical error


 
==== Volume Space ====
  define command{
         command_name    check_qnap_sysvol_space
Line 89: Line 92:
** <code>.iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemVolumeTable.SysVolumeEntry.SysVolumeFreeSize.$ARG1$</code>


==== Disk Status ====
  define command{
         command_name    check_qnap_disk_status
         command_line    /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -o .1.3.6.1.4.1.24681.1.2.11.1.4.$ARG1$ -m /etc/nagios3/mibs/QNAP-NAS.mib -l "Disk Status" -r 0
         }
* <code> -o .1.3.6.1.4.1.24681.1.2.11.1.4.$ARG1$ </code> - The SNMP OID being checked. As above, $ARG1$ is used as a command parameter so that I can create separate checks for the individual disks without creating a separate check command for each.
** <code>.iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemHdTable.HdEntry.HdStatus.$ARG1$</code>
* <code> -m /etc/nagios3/mibs/QNAP-NAS.mib </code> - Path to the QNAP MIB file.  The value returned is an integer: 0 for ready/good, a negative value for a fault.  In order to translate the value (e.g. <code>-9</code>) to its actual meaning (e.g. <code>rwError</code>), Nagios needs access to the MIB file.  You will need to download it from your NAS (from the Network Services | SNMP Settings page) and copy it to the path indicated on your Nagios server.
* <code> -r 0 </code> - The data expected back from the poll. 0 maps to <code>ready</code>; anything else causes a critical error


==== Disk SMART Status ====
  define command{
         command_name    check_qnap_disk_smart_status
         command_line    /usr/lib/nagios/plugins/check_snmp -H '$HOSTADDRESS$' -o .1.3.6.1.4.1.24681.1.2.11.1.7.$ARG1$ -l "SMART Info State" -r "GOOD"
         }
Line 98: Line 111:
* <code> -r "GOOD" </code> - The text expected back from the poll, anything else causes a critical error


==== Disk Temperature ====
  define command{
         command_name    check_qnap_disk_temp
Line 104: Line 118:
* <code> -o .1.3.6.1.4.1.24681.1.2.11.1.3.$ARG1$ </code> - The SNMP OID being checked, as above $ARG1$ is used as a command parameter so that I can create separate checks for the individual disks without creating a separate check command for each.
** <code>.iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemHdTable.HdEntry.HdTemperature.$ARG1$</code>


=== Create Services ===
Line 144: Line 157:
         service_description    Status Disk 1
         check_command          check_qnap_disk_status!1
         }
  define service{
         use                    generic-service
         hostgroup_name         qnap-nas
         service_description    SMART Disk 1
         check_command          check_qnap_disk_smart_status!1
         }
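
Further disks follow the same pattern: only the argument after the <code>!</code> changes, and Nagios substitutes it for <code>$ARG1$</code> in the command line. A minimal sketch for a second disk, assuming the NAS has a disk at index 2 and reusing the <code>generic-service</code> template and <code>qnap-nas</code> hostgroup from the definitions above; the service descriptions are illustrative labels only:

  # Sketch only: assumes a physical disk exists at SNMP index 2
  define service{
         use                    generic-service
         hostgroup_name         qnap-nas
         service_description    Status Disk 2
         check_command          check_qnap_disk_status!2
         }
  define service{
         use                    generic-service
         hostgroup_name         qnap-nas
         service_description    SMART Disk 2
         check_command          check_qnap_disk_smart_status!2
         }

With these in place, the status check should poll <code>.1.3.6.1.4.1.24681.1.2.11.1.4.2</code> and the SMART check <code>.1.3.6.1.4.1.24681.1.2.11.1.7.2</code>.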

