By wjaiken on Dec 01, 2008
This is the first in a series of blogs on installing, configuring and running Nagios on Solaris. If you're not familiar with Nagios, check out nagios.org to learn about this great open source network monitoring application written by Ethan Galstad.
I jumped in by downloading the source, building it with gcc on OpenSolaris 2008.05 and deploying it on a single OpenSolaris x64 system and a Solaris 10 SPARC system. Some things worked immediately, and some didn't.
Then I ran across this book: "Building a Monitoring Infrastructure with Nagios" by David Josephsen.
See it here: http://www.skeptech.org/?page_id=4
I recommend reading this book cover-to-cover to understand all the issues involved in monitoring remote hosts and applications with Nagios. Nagios is an example of a highly configurable application which requires the user to be aware of security and performance issues when configuring the network for monitoring. A poor job done in configuration will result in sluggish monitoring capability and unhappy users, so knowing the issues up front is key to using Nagios successfully.
This book gives you a great look behind the scenes, explaining how each piece of Nagios works and how to configure the app for each device on your network. The book is written from a Linux perspective, but the author points out some of the differences for other *nix and Windows systems.
The book describes Nagios 2 usage. I downloaded and built Nagios 3.0.5, so I'm sure there are a few updates to the author's directions.
Nagios has several components:
1. Nagios proper (sometimes referred to as the daemon) which runs on your monitoring host. It decides when to collect information on each monitored host, initiates actions to collect the data, and writes the collected information to log files. The user selects which hosts/devices are to be monitored, how frequently to gather the information, how much information (disk and CPU usage, application status, etc.) to gather and whom to notify when a user-defined problem is detected.
The user specifies all of this in configuration files. As you can imagine, the number and contents of the config files can become quite extensive and Josephsen devotes chapters 4 and 5 to describing the form and contents of these files, as well as some semi-automatic means to aid the user in creating the files.
2. The Nagios GUI is a CGI-based visual monitoring tool that gives the user a summary of the network and monitored application status in a convenient format. Events triggering warnings or escalations are highlighted in yellow or red respectively. More about the GUI in future blogs.
3. Nagios plugins are small executable files (scripts or binaries) which run on each of the remote hosts and collect one piece of information (number of users, percentage of disk full, etc). There are over 200 of these files which are downloadable from nagios.org.
4. The remaining Nagios piece is the connection mechanism which links the monitored data on the remote host with the Nagios daemon. The most commonly used mechanism is the check_nrpe plugin, running on the Nagios daemon host, and the NRPE daemon running on each monitored remote host.
Galstad's documentation on NRPE and check_nrpe is available at http://nagios.sourceforge.net/docs/nrpe/NRPE.pdf . This document contains handy drawings to depict how the two components work together to transfer remote host information to the daemon.
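To make the plugin idea in item 3 a little more concrete, here is a minimal sketch of what a plugin could look like. It is not one of the plugins shipped from nagios.org: the name check_disk_pct, the 80/90 percent thresholds and the choice of Python are my own assumptions for illustration. What it does follow is the standard plugin convention, where a plugin prints one line of status text and reports its state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
#!/usr/bin/env python3
"""Hypothetical check_disk_pct plugin: report how full one filesystem is."""
import shutil
import sys

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(percent_used, warn=80.0, crit=90.0):
    """Map a usage percentage to (exit_code, status_line).

    The default thresholds are illustrative assumptions.
    """
    if percent_used >= crit:
        return CRITICAL, "DISK CRITICAL - %.1f%% used" % percent_used
    if percent_used >= warn:
        return WARNING, "DISK WARNING - %.1f%% used" % percent_used
    return OK, "DISK OK - %.1f%% used" % percent_used


def check_disk(path="/", warn=80.0, crit=90.0):
    """Measure one filesystem and return (exit_code, status_line)."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as err:
        # The check itself could not run, which is different from a bad result.
        return UNKNOWN, "DISK UNKNOWN - %s" % err
    if usage.total == 0:
        return UNKNOWN, "DISK UNKNOWN - %s reports zero size" % path
    percent_used = 100.0 * usage.used / usage.total
    return evaluate(percent_used, warn, crit)


def main(argv):
    """Check the path given as the first argument, defaulting to /."""
    path = argv[1] if len(argv) > 1 else "/"
    code, line = check_disk(path)
    print(line)   # the status text Nagios displays
    return code   # the state Nagios records


# When installed as a real plugin, finish the file with:
#     sys.exit(main(sys.argv))
```

Whether this runs locally or gets invoked through NRPE on a remote host, the contract is the same: one line of output and an exit code, which is all the daemon needs to decide whether to notify someone.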
That's all for this blog.
In future blogs, I will address unique considerations for getting Nagios up and running on your OpenSolaris and Solaris 10 systems. Email me with any Solaris-specific Nagios issues and I'll try to help you out.