Monitoring Zimbra with Nagios

From vwiki
Revision as of 13:18, 20 September 2012 by Sstrutt (talk | contribs) (Added Spellcheck and DNS checks)

This page covers Zimbra application-specific monitoring checks; OS-level checks (eg low disk space, or high CPU) are not included. The majority of it was gleaned from this forum thread - http://www.zimbra.com/forums/administrators/20029-zimbra-monitoring-nagios.html

All my Zimbra-specific checks are applied to hosts via a zimbra-servers hostgroup. Create a group in your hostgroups_nagios2.cfg file to contain your Zimbra hosts, for example...

define hostgroup {
       hostgroup_name  zimbra-servers
       alias           Zimbra servers
       members         mail.domain.com,mail2.domain.com
       }
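
The service definitions on this page all inherit from a zimbra-svc-template, which is not shown here. The following is only a sketch of what such a template might look like, not the original one. It assumes a generic-service template already exists (as in a stock Debian/Ubuntu nagios3 install), and the interval values are placeholders to adjust for your environment.

```cfg
# Hypothetical template; the values below are assumptions, not the author's settings
define service {
       name                    zimbra-svc-template    ; Name referenced by "use" in the service definitions
       use                     generic-service        ; Assumes generic-service is already defined
       check_interval          5                      ; Check every 5 time units (minutes by default)
       retry_interval          1                      ; Re-check every minute after a failure
       max_check_attempts      3                      ; Alert after 3 consecutive failures
       notification_interval   0                      ; Set > 0 if you want to be renotified
       register                0                      ; Template only, not a real service
       }
```

Keeping the shared settings in one template means thresholds for every Zimbra check can be changed in a single place.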

To apply any changes to Nagios configuration, run the following commands

  • nagios3 -v /etc/nagios3/nagios.cfg
    • Performs basic validation of the config
  • service nagios3 restart
    • Restarts the Nagios service

Core Service Checks

Note that not all checks will be appropriate for your installation. For example, if you only allow user access over secure links then the IMAP and HTTP / Web Client checks will not apply.

SMTP

Checks that your server can receive incoming email.

define service{
       use                     zimbra-svc-template     ; Inherit default values from a template
       hostgroup_name          zimbra-servers
       service_description     SMTP
       check_command           check_smtp
       }

Postfix

Checks that outgoing email isn't backing up in the Postfix queue on your server. Requires NRPE to be installed and running 1st.

On the Zimbra server...

  1. Update the path to mailq used by the Nagios plugins, in /usr/lib/nagios/plugins/utils.pm
    • EG $PATH_TO_MAILQ = "/opt/zimbra/postfix/sbin/mailq";
  2. Add to NRPE config /etc/nagios/nrpe.cfg, update the warning and critical thresholds to suit your environment
    • command[check_zimbra_mailq]=/usr/lib/nagios/plugins/check_mailq -w 10 -c 20 -M postfix
  3. Restart NRPE to apply config change
    • service nagios-nrpe-server restart

Add a new service (on Nagios server)...

define service {
       use                     zimbra-svc-template
       hostgroup_name          zimbra-servers
       service_description     Postfix MailQ
       check_command           check_nrpe!check_zimbra_mailq
}
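
The NRPE-based services on this page rely on a check_nrpe command being defined on the Nagios server. Packaged installs often ship one already (check before adding another, since a second command with the same name would conflict). If yours does not, a minimal sketch could look like the following, assuming the check_nrpe plugin lives in the same plugin directory used elsewhere on this page.

```cfg
# Only needed if check_nrpe is not already defined on the Nagios server
# (the plugin path is an assumption, adjust to your install)
define command {
       command_name    check_nrpe
       command_line    /usr/lib/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
       }
```

Here $ARG1$ is the part after the ! in check_command, so check_nrpe!check_zimbra_mailq asks the remote NRPE daemon to run its check_zimbra_mailq command.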

IMAP

Checks that your server can be accessed by IMAP clients (eg mobile device, desktop email client)

define service{
       use                     zimbra-svc-template     ; Inherit default values from a template
       hostgroup_name          zimbra-servers
       service_description     IMAP
       check_command           check_imap
       }

IMAP over SSL

Checks that your server can be accessed by IMAP clients (eg mobile device, desktop email client) over SSL.

define command{
       command_name    check_imaps
       command_line    /usr/lib/nagios/plugins/check_imap -H $HOSTADDRESS$ -p 993 -S
       }
define service{
       use                     zimbra-svc-template     ; Inherit default values from a template
       hostgroup_name          zimbra-servers
       service_description     IMAP SSL
       check_command           check_imaps
       }

HTTP / Web Client

Checks that the web-mail interface is accessible to users.

You'll probably already have a basic HTTP service check defined, in which case just add your Zimbra servers to the http-servers hostgroup in hostgroups_nagios2.cfg
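
As a rough illustration, adding the Zimbra hosts to an existing http-servers hostgroup might look like this. The alias and the localhost member are assumptions about what a stock hostgroups_nagios2.cfg contains; keep whatever members your file already lists and append the Zimbra hosts.

```cfg
# Sketch only; existing alias and members will differ per installation
define hostgroup {
       hostgroup_name  http-servers
       alias           HTTP servers
       members         localhost,mail.domain.com,mail2.domain.com
       }
```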

HTTPS / Web Client

Checks that the web-mail interface is accessible to users over HTTPS.

You may already have a basic HTTPS service check defined, in which case just add your Zimbra servers to the https-servers hostgroup in hostgroups_nagios2.cfg

If not, create a new service...

define service {
       hostgroup_name                  https-servers
       service_description             HTTPS
       check_command                   check_https
       use                             generic-service
       notification_interval           0 ; set > 0 if you want to be renotified
}

Then create an https-servers hostgroup in hostgroups_nagios2.cfg, for example...

define hostgroup {
       hostgroup_name  https-servers
       alias           HTTPS servers
       members         mail.domain.com,mail2.domain.com
       }

System Checks

Clam AV Service

Checks that the Clam Anti Virus service is running. Requires NRPE to be installed and running 1st.

On the Zimbra server, add the following to /etc/nagios/nrpe.cfg

command[check_clamd]=/usr/lib/nagios/plugins/check_clamd -H localhost

Add a new service (on Nagios server)...

define service {
       use                     zimbra-svc-template
       hostgroup_name          zimbra-servers
       service_description     ClamAV Svc
       check_command           check_nrpe!check_clamd
}

Clam AV Updates

Checks that the Clam Anti Virus definitions are up to date. Requires NRPE to be installed and running 1st, and also a 3rd party script to perform the check.

On the Zimbra server

  1. Install Perl Net::DNS module
    • apt-get install libnet-dns-perl
  2. Download the script to /usr/lib/nagios/plugins/ and make executable
  3. Edit the following script lines to suit the paths on your server
    • Line 31: use lib "/usr/lib/nagios/plugins";
    • Line 39: my $clamd_cmd = "/opt/zimbra/clamav/sbin/clamd";
    • Line 166: chomp(my $clamd_ver = `$clamd_cmd -V -c /opt/zimbra/conf/clamd.conf`);
  4. Edit /etc/sudoers so that the script can be run by the nagios user, via sudo, as the zimbra user
    • nagios ALL=(zimbra) NOPASSWD: /usr/lib/nagios/plugins/check_clamav
    • Edit the file with visudo rather than changing its permissions; visudo checks the syntax before saving, and sudo refuses to run if /etc/sudoers is left world writable
  5. Add to NRPE config /etc/nagios/nrpe.cfg
    • command[check_zimbra_clam_ud]=sudo -u zimbra /usr/lib/nagios/plugins/check_clamav -w 2 -c 5
  6. Restart NRPE to apply config change
    • service nagios-nrpe-server restart

Add a new service (on Nagios server)...

define service {
       use                     zimbra-svc-template
       hostgroup_name          zimbra-servers
       service_description     ClamAV Updates
       check_command           check_nrpe!check_zimbra_clam_ud
}

LMTP

Checks that LMTP (used for internal routing of emails through Clam, to mailboxes, etc) is functioning. Requires NRPE to be installed and running 1st.

On the Zimbra server, add the following to /etc/nagios/nrpe.cfg. Note that you need to update the FQDN for your server in the expected return field.

command[check_zimbra_lmtp]=/usr/lib/nagios/plugins/check_smtp -H localhost -p 7025 -e '220 mail.domain.com Zimbra LMTP server ready'

Add a new service (on Nagios server)...

define service {
       use                     zimbra-svc-template
       hostgroup_name          zimbra-servers
       service_description     LMTP
       check_command           check_nrpe!check_zimbra_lmtp
}


Spellcheck

Checks that the spelling checker service is available. Requires NRPE to be installed and running 1st.

On the Zimbra server, add the following to /etc/nagios/nrpe.cfg

command[check_zimbra_spell]=/usr/lib/nagios/plugins/check_http -H localhost -p 7780

Add a new service (on Nagios server)...

define service {
       use                     zimbra-svc-template
       hostgroup_name          zimbra-servers
       service_description     Zimbra Spelling
       check_command           check_nrpe!check_zimbra_spell
}


Dependency Checks

DNS

Checks that your Zimbra server is able to make DNS lookups. Requires NRPE to be installed and running 1st.

Note that you need to consider which DNS server(s) it is most appropriate to check against. For example, if you run a local Bind instance on your Zimbra server, you may want to test both that, and the DNS server which it forwards requests to.

On the Zimbra server, add the following to /etc/nagios/nrpe.cfg

command[check_dns]=/usr/lib/nagios/plugins/check_dns -H google.com

If you need to test resolution against a DNS server other than the default (as specified in /etc/resolv.conf), use the -s option, for example...

command[check_dns]=/usr/lib/nagios/plugins/check_dns -H google.com -s 8.8.8.8

Add a new service (on Nagios server)...

define service {
       use                     zimbra-svc-template
       hostgroup_name          zimbra-servers
       service_description     DNS Resolution
       check_command           check_nrpe!check_dns
}
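
If you do run a local Bind instance as described above, one way to cover both resolvers is to define two NRPE commands and a service for each. This is only a sketch: the command names check_dns_local and check_dns_forwarder are made up for illustration, 127.0.0.1 assumes Bind listens on localhost, and 8.8.8.8 stands in for whichever forwarder you actually use.

```cfg
# On the Zimbra server, in /etc/nagios/nrpe.cfg (hypothetical command names)
command[check_dns_local]=/usr/lib/nagios/plugins/check_dns -H google.com -s 127.0.0.1
command[check_dns_forwarder]=/usr/lib/nagios/plugins/check_dns -H google.com -s 8.8.8.8

# On the Nagios server, one service per resolver
define service {
       use                     zimbra-svc-template
       hostgroup_name          zimbra-servers
       service_description     DNS Resolution Local
       check_command           check_nrpe!check_dns_local
}
define service {
       use                     zimbra-svc-template
       hostgroup_name          zimbra-servers
       service_description     DNS Resolution Forwarder
       check_command           check_nrpe!check_dns_forwarder
}
```

With separate services, a failure of only the forwarder check points at the upstream server, while a failure of only the local check points at Bind on the Zimbra host. Remember to restart NRPE on the Zimbra server and Nagios on the monitoring server after adding these.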