Difference between revisions of "Nagios"

m (Reverted edits by Ipodsoft (talk) to last revision by Sstrutt)
 
Line 3: Line 3:


Nagios is centred around device polling (it can receive SNMP traps, but that's a more advanced feature) and the presentation of state data.  The first thing to appreciate, though, is that Nagios doesn't actually do any monitoring; at its core it's a task scheduling and state management engine.  It needs third-party '''plugins''', which do the actual monitoring and report the state of the host you're monitoring back to it.  There are plugins provided out-of-the-box, which will probably achieve most (if not all) of what you want.
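To make the plugin idea concrete, here is a minimal sketch of what a plugin has to do: take a reading, decide on a state, and hand back a status line plus one of the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names, the label and the sample thresholds are my own assumptions for illustration; they are not taken from any shipped plugin.

```python
# Standard Nagios plugin exit codes: the scheduler reads the state from
# the process exit status and the message from the first line of stdout.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured value to a plugin state (higher readings are worse)."""
    if value is None:
        return UNKNOWN  # no reading could be obtained
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def run_check(value, warn, crit, label="Temp", units="C"):
    """Return (exit_code, status_line) as a plugin would emit them.

    The label, units and line layout are hypothetical, not the exact
    output format of any real plugin.
    """
    state = evaluate(value, warn, crit)
    shown = "no data" if value is None else f"{value} {units}"
    return state, f"{label} {STATE_NAMES[state]} - {shown}"
```

A real plugin would print the status line and then exit with the code, for example by passing it to <code>sys.exit()</code>; Nagios itself only schedules the call and records the result.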


== Terminology ==
Line 14: Line 10:
* '''command''' - A command is a command-line call of a plugin with one or more parameters, which defines how you might use a plugin to test a host.
* '''service''' - A service is something that you care about on a host, that you want to test (eg web server response, ping, disk space, CPU,  


== Useful Paths etc ==
Line 34: Line 26:
| <code> service nagios3 restart </code>  || Restart service (reloads config - will fail if config is invalid!)
|}


== Create SNMP Checks ==
Everything here creates various checks for my '''QNAP NAS''', which I've used as an example.


=== Define OIDs to Poll ===
Line 52: Line 36:


Having downloaded the MIB and done some probing with GetIf, I've decided I need to monitor the following OIDs...


{|class="vwikitable"
Line 79: Line 59:


I created a new file, called <code>/etc/nagios3/conf.d/commands_qnap.cfg</code> and added the following...


==== System Temperature ====
Line 96: Line 72:
* <code> -l Temp </code> - A label for the check (appears in the check's Status Information column in Nagios display)
* <code> -u C </code> - The units of the metric being checked (appears in the check's Status Information column in Nagios display)


==== Volume Status ====
Line 109: Line 81:
** <code>.iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemVolumeTable.SysVolumeEntry.SysVolumeStatus.$ARG1$</code>
* <code> -r "Ready" </code> - The text expected back from the poll; anything else causes a critical error
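The expected-text test can be pictured roughly as below. This is only a sketch of the decision, assuming an unanchored regular expression search (so a pattern of <code>Ready</code> would also match a longer string that contains it); the function name and the sample values in the usage note are hypothetical.

```python
import re

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def regex_state(polled_text, expected_pattern):
    """Return OK if the pattern is found in the polled text, else CRITICAL.

    Uses an unanchored search, which is an assumption of this sketch;
    anchor the pattern with ^ and $ if an exact match is wanted.
    """
    return OK if re.search(expected_pattern, polled_text) else CRITICAL
```

For example, <code>regex_state("Ready", "Ready")</code> gives 0, while a made-up reply such as <code>"Degraded"</code> gives 2.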


==== Volume Space ====
Line 123: Line 91:
* <code> -c $ARG2$: </code> - The critical threshold. Defining it as a command parameter allows me to alter the service threshold without altering the command definition. The trailing <code> : </code> makes it a ''should be more than'' check rather than the normal ''should be less than'' check.
** <code>.iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemVolumeTable.SysVolumeEntry.SysVolumeFreeSize.$ARG1$</code>
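The effect of the trailing colon is easier to see with a small model of the threshold range notation used by the standard plugins, where a bare <code>N</code> means "alert outside 0..N" and <code>N:</code> means "alert below N". This is a sketch of my reading of that notation, not the plugins' own parsing code, and the function names are my own.

```python
import math


def parse_range(spec):
    """Parse a threshold such as '10', '10:', '~:10', '10:20' or '@10:20'.

    Returns (low, high, inside), where inside=True means alert when the
    value falls inside the range rather than outside it.
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start, end = spec.split(":", 1)
    else:
        start, end = "0", spec  # a bare number means the range 0..N
    low = -math.inf if start == "~" else float(start or 0)
    high = math.inf if end == "" else float(end)
    return low, high, inside


def breaches(value, spec):
    """Return True when value should raise an alert for this threshold."""
    low, high, inside = parse_range(spec)
    in_range = low <= value <= high
    return in_range if inside else not in_range
```

So with a free space threshold of <code>10:</code>, a reading of 5 breaches it and a reading of 15 does not, whereas a plain <code>10</code> behaves the other way round.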


==== Disk Status ====
Line 146: Line 110:
** <code>.iso.org.dod.internet.private.enterprises.storage.storageSystem.SystemInfo.SystemHdTable.HdEntry.HdSmartInfo.$ARG1$</code>
* <code> -r "GOOD" </code> - The text expected back from the poll; anything else causes a critical error


==== Disk Temperature ====
Line 212: Line 172:
         check_command          check_qnap_disk_temp!1
         }




Line 239: Line 196:


In general it's better to make such changes to generic templates, which can then be applied to one or more service checks.  You can then make changes centrally, rather than going round and updating individual services.  Templates can be daisy-chained so that subsequent templates override or add to config (see http://nagios.sourceforge.net/docs/3_0/objectinheritance.html for further info).
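As a rough mental model of daisy-chained templates, the sketch below flattens an object by walking its <code>use</code> chain, with settings nearer the object winning over those further up. It only models a single parent per object and plain overrides; the object names and values in the usage note are hypothetical, and this is not how Nagios itself is implemented.

```python
def resolve(name, objects):
    """Flatten an object by following its 'use' chain of templates.

    objects maps a name to a dict of directives; 'use' names the parent.
    Values set closer to the object override those set further up.
    """
    chain = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError(f"template loop at {name!r}")
        seen.add(name)
        chain.append(objects[name])
        name = objects[name].get("use")
    merged = {}
    for obj in reversed(chain):  # apply the most generic template first
        merged.update(obj)
    merged.pop("use", None)  # the link itself is not an effective setting
    return merged
```

For instance, if a hypothetical <code>slow-service</code> template uses <code>generic-service</code> and adds <code>check_interval</code>, a service using <code>slow-service</code> picks up both sets of values and can still override any of them locally.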


=== Check Frequency ===
Line 295: Line 248:
         check_command          check_wib_svc
         }


== Ubuntu Software Updates Monitor ==
Line 330: Line 279:


   
   
=== SNMP Based (Michal Ludvig) ===
'''The check script that is called by SNMP doesn't work!  I've left this here for the time being as the remote SNMP exec mechanism does work, and I expect to use it at some point.  When I do, I'll remove this, and document that instead.'''
Line 381: Line 327:
         notification_interval          0 ; set > 0 if you want to be renotified
  }


== NRPE ==
Line 401: Line 343:
* smbclient
* snmp


=== Setup ===
Line 475: Line 413:
         notification_interval          0 ; set > 0 if you want to be renotified
  }


== Web Site Content and Response Time Monitoring ==
Line 489: Line 423:


Therefore I took one that almost did, <code>[http://exchange.nagios.org/directory/Plugins/Websites%2C-Forms-and-Transactions/check_http_content/details check_http_content]</code>, and modified it to match my requirements, calling it <code>[http://dl.sandfordit.com/scripts/check_url_content check_url_content]</code>.  I'll upload it to the exchange once I've got it working with the <code>Nagios::Plugin</code> Perl module; for the time being it's available via the previous link.


=== Script Options ===
Line 527: Line 457:
| Host (optional when Username specified), should be in the following format 'www.domain.com:443'
|}


=== Examples ===