Modules
There are a number of modules included with resmon that will cover most things you need to monitor. A list of the modules is below, along with a sample configuration. You can also create your own modules.
Generic configuration options
The following options are applicable to any module:
- interval : cache the result for n seconds. Useful for long-running modules. Default: do not cache
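As a rough illustration of what the interval option does, the sketch below reuses a check's result until n seconds have passed and only then runs the check again. It is written in Python for brevity and is not resmon's own implementation; the CachedCheck class and its clock parameter are hypothetical names introduced for this example.

```python
import time


class CachedCheck:
    """Hypothetical wrapper: re-run a check only after `interval` seconds."""

    def __init__(self, check, interval=0, clock=time.monotonic):
        self.check = check        # callable that performs the real check
        self.interval = interval  # 0 mirrors the default: do not cache
        self.clock = clock        # injectable clock, handy for testing
        self._result = None
        self._stamp = None

    def run(self):
        now = self.clock()
        fresh = (
            self.interval > 0
            and self._stamp is not None
            and now - self._stamp < self.interval
        )
        if not fresh:
            # Cache is empty, expired, or disabled: run the check again.
            self._result = self.check()
            self._stamp = now
        return self._result
```

With interval set to 0 every call runs the check, which corresponds to the documented default of not caching.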
A1000
This module monitors the health of an A1000 Storedge disk array.
Sample Configuration
A1000 {
fa000_001 : status => Optimal
}
Arguments
- Object : the unit you wish to monitor
- status : the status that you consider to be OK
DATE
A simple module that just prints the current Unix timestamp. This can be useful when using the status.txt file to ensure that you have up-to-date information. However, when using the XML checks, this module is not needed, as each check includes information on when it was last updated.
Sample Configuration
DATE {
date : noop
}
DHCPLEASES
This module checks the number of active DHCP leases for a network and warns if that number grows close to the maximum number of addresses available in the DHCP pool.
Sample Configuration
DHCPLEASES {
10.0.0 : warn => 15, crit => 25
192.168.0 : warn => 25, crit => 45
}
Arguments
- Object : the network you wish to check the leases for
- warn : the number of leases above which you want to warn
- crit : the number of leases above which you want to go critical
DISK
This module checks the amount of free disk space using df.
Sample Configuration
DISK {
/data1 : limit => 95%, warnat => 70%
/data2 : limit => 95%, warnat => 70%
/data3 : limit => 95%
}
Arguments
- Object : the mount point or device for which you want to check free space
- limit : the percentage at which you want to go critical
- warnat : (optional) the percentage at which you want to warn
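The way limit and warnat combine can be sketched as below. This is an illustration in Python, not the module's code: the function names, the state names (OK, WARNING, BAD), and the choice of "greater than or equal" for the comparison are all assumptions made for the example.

```python
def parse_percent(value):
    """Turn a config value such as '95%' into the number 95.0."""
    return float(str(value).strip().rstrip("%"))


def disk_state(used_percent, limit, warnat=None):
    """Classify disk usage against limit and the optional warnat.

    State names and the >= comparison are illustrative assumptions.
    """
    if used_percent >= parse_percent(limit):
        return "BAD"
    if warnat is not None and used_percent >= parse_percent(warnat):
        return "WARNING"
    return "OK"
```

For example, with limit => 95% and warnat => 70%, a filesystem that is 80% full would be reported as a warning in this sketch, and without warnat it would still be OK.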
DNS
This module checks the status of the BIND DNS server.
Sample Configuration
DNS {
dns : key => /dns/etc/rndc.key
}
Arguments
- Object : this is just a label used to identify the check
- key : the path to the dns key used by rndc
ECCMGR
This module connects with the Ecelerity eccmgr and ensures that it is running.
Sample Configuration
ECCMGR {
eccmgr : socket => /tmp/2026
}
Arguments
- Object : this is just a label used to identify the check
- socket : the path to the socket to connect to eccmgr
Notes
This is one of the checks that requires a special module to connect. The easiest way to satisfy this is to run resmon using the version of perl that comes with Ecelerity.
FAULTS
This module checks for any hardware faults using the fmadm command.
Sample Configuration
FAULTS {
hardware : noop
}
FILEAGE
This module monitors the age of a specific file, going bad if the file is too old or too new.
Sample Configuration
FILEAGE {
/path/to/file : minimum => 30, maximum => 3600
/path/to/file2 : maximum => 7200
}
Arguments
- Object : the path to the file you wish to monitor
- minimum : (optional) the minimum age of the file in seconds you consider to be OK
- maximum : (optional) the maximum age of the file in seconds you consider to be OK
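The age test described above can be sketched as follows. This Python version is only an illustration and makes two assumptions: that age is measured from the file's modification time, and that both bounds are inclusive. The function name file_age_ok and its now parameter are hypothetical.

```python
import os
import time


def file_age_ok(path, minimum=None, maximum=None, now=None):
    """Return True if the file's age in seconds is within the given bounds.

    Age is taken from the modification time (an assumption of this sketch).
    """
    now = time.time() if now is None else now
    age = now - os.stat(path).st_mtime
    if minimum is not None and age < minimum:
        return False  # file is too new
    if maximum is not None and age > maximum:
        return False  # file is too old
    return True
```

Leaving out either bound disables that side of the check, which matches both arguments being optional.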
FILECOUNT
This module monitors the number of files in a directory, going bad when the file count goes over a threshold.
Sample Configuration
FILECOUNT {
/path/to/dir : slimit => 10, hlimit => 20
}
Arguments
- Object : the path to the directory
- slimit : the 'soft' threshold, above which the module will warn
- hlimit : the 'hard' threshold, above which the module will go critical
FILESIZE
This module monitors the size of a specific file, going bad if it is too big or too small.
Sample Configuration
FILESIZE {
/path/to/file : minimum => 1, maximum => 16384
}
Arguments
- Object : the path to the file you want to monitor
- minimum : the minimum file size, in bytes
- maximum : the maximum file size, in bytes
FRESHSVN
This module checks a subversion checkout to make sure it is up to date and pointing to the correct url. See also the SIMPLESVN module, which doesn't perform as thorough a check, but has fewer requirements and works with older versions of subversion.
Sample Configuration
FRESHSVN {
/opt/resmon : URL => https://labs.omniti.com/resmon/trunk
}
Arguments
- Object : Path to the working copy
- URL : the url that the working copy should be checked out from
- maxlag : (optional, default 330 seconds) the amount of time you allow for the repository to update before it is considered out of date. It's a good idea to set this to the interval at which your update cron job runs, plus a few seconds.
INODES
This module monitors the number of free inodes on a filesystem.
Sample Configuration
INODES {
/ : limit => 90%
/data : limit => 90%
}
Arguments
- Object : The filesystem you wish to monitor
- limit : the percentage of inodes used after which you want to alarm
LARGEFILES
This module looks for 'large' files in a directory.
Sample Configuration
LARGEFILES {
/path/to/dir : limit => 16384
}
Arguments
- Object : the directory you wish to monitor
- limit : the maximum file size in bytes
LOGFILE
This module monitors a log file, looking for errors. What the module considers an error is configurable.
Sample Configuration
LOGFILE {
/var/log/mylogfile : max => 4, match => ^ERROR:
}
Arguments
- Object : path to the log file
- match : regex that defines what an error is
- max : (optional, default 8) the maximum number of errors you will allow before going critical
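To show how match and max interact, here is a small Python sketch that counts the lines matching the regex and compares the count with max. It is not the module's code: the function names and state names are hypothetical, it assumes the check goes critical only when the count exceeds max, and it does not model how the module decides which part of the log to read.

```python
import re


def count_errors(lines, match):
    """Count the lines that the `match` regex identifies as errors."""
    pattern = re.compile(match)
    return sum(1 for line in lines if pattern.search(line))


def logfile_state(lines, match, max_errors=8):
    """Go critical once the error count exceeds max (assumed semantics)."""
    return "BAD" if count_errors(lines, match) > max_errors else "OK"
```

With match => ^ERROR: only lines that begin with "ERROR:" are counted, so an indented or prefixed error line would be missed unless the regex is loosened.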
NETBACKUPTAPE
This module checks the status of tape drives in netbackup, and will go critical if any are down, or there are no drives up.
Sample Configuration
NETBACKUPTAPE {
tapes : noop
}
NETSTAT
This module checks the output of netstat, as its name suggests. It can be used to ensure that a server is listening on a specified port, or that a certain connection is currently open.
Sample Configuration
NETSTAT {
ssh : state => LISTEN, localport => 22
sshtunnel : state => ESTABLISHED, remoteip => 10.0.0.1, remoteport => 22
}
Arguments
- Object : this is just a label used to identify the check
- state : the connection state (e.g. LISTEN, ESTABLISHED)
- localport : (optional) the local port of the connection
- localip : (optional) the local ip of the connection
- remoteport : (optional) the remote port of the connection
- remoteip : (optional) the remote ip of the connection
NEWFILES
This module ensures a directory has files modified later than a certain time (for example, checking that new logfiles are being generated).
Sample Configuration
NEWFILES {
/test/dir : minutes => 5, filecount => 2
/other/dir : minutes => 60
}
Arguments
- Object : the directory to monitor
- minutes : how old, in minutes, a file can be before we no longer consider it 'new'
- filecount : (optional, default 1) how many new files we require
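The minutes and filecount arguments can be read as "at least filecount files were modified in the last minutes minutes". The Python sketch below illustrates that reading. The function names are hypothetical, and it assumes that only regular files directly inside the directory are counted and that modification time is what matters.

```python
import os
import time


def count_new_files(directory, minutes, now=None):
    """Count regular files modified within the last `minutes` minutes."""
    now = time.time() if now is None else now
    cutoff = now - minutes * 60
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.stat().st_mtime >= cutoff:
                count += 1
    return count


def newfiles_ok(directory, minutes, filecount=1, now=None):
    """True if the directory holds at least `filecount` new files."""
    return count_new_files(directory, minutes, now) >= filecount
```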
OLDFILES
This module checks for files in a directory that are older than a certain time.
Sample Configuration
OLDFILES {
/test/dir : minutes => 5, filecount => 2, checkmount => 1
/other/dir : minutes => 60
}
Arguments
- Object : the directory to monitor
- minutes : how old can the files be before we alarm
- checkmount : check to make sure the directory is mounted first (only enable if the dir you are checking is the mountpoint of a filesystem)
- filecount : how many old files will we allow before alarming. If this is not set, then we will alarm if any files are old.
PGREP
This module checks for running processes using the pgrep command.
Sample Configuration
PGREP {
dhcpd : arg0 => em0, arg1 => em1
}
Arguments
- Object : the process name you are looking for
- arg1-arg3: (optional) the arguments that must have been passed to the command
QUEUESIZE
This module checks the size of ecelerity mail queues (delayed or active) and alerts if any are over a certain limit.
Sample Configuration
QUEUESIZE {
aol.com : queue => delayed, count => 3000
yahoo.com : queue => delayed, count => 3000
msn.com : queue => delayed, count => 3000
hotmail.com : queue => delayed, count => 3000
common : queue => delayed, count => 3000
}
Arguments
- Object : the domain name of the queue you wish to monitor, or 'common'. Picking common will monitor all queues except for major ISPs.
- queue : which queue to monitor (active or delayed)
- count : the number of messages allowed in the queue before we alarm
REMOTEFILESIZE
This module checks the size of a remote file using ssh. This requires that passwordless ssh be set up for root from one machine to the other.
Sample Configuration
REMOTEFILESIZE {
/path/to/file : host => other.example.com, minimum => 1, maximum => 131072
}
Arguments
- Object : the path to the file to be monitored
- host : the hostname of the server
- minimum : the minimum file size in bytes
- maximum : the maximum file size in bytes
RESMON
This module monitors resmon itself and reports if there is a problem with the config file or if there are any failed modules. This is most useful in conjunction with auto updating, when modules are reloaded without restarting resmon.
Note: at some point, this module may be added by default, but at the moment it needs to be included in the config file.
This check will also report the subversion revision number if resmon is running from a checkout.
Sample Configuration
RESMON {
resmon : noop
}
SCRIPT
This module runs a perl script and expects some output from the script in the form of "STATUS(message)". This allows resmon to run helper scripts without needing to write a complete module.
Sample Configuration
SCRIPT {
myscript : script => /path/to/myscript.pl, timeout => 300
}
Arguments
- Object : This is just a label used to identify the check
- script : the path to the perl script
- timeout : (optional, default 30) how long, in seconds, to cache the result of the command
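The "STATUS(message)" output format can be illustrated with a small parser. The sketch below is in Python rather than perl and is not how the module itself reads the output; the regex, the function name, and the assumption that the status is a single word followed by a parenthesised message on one line are all choices made for this example.

```python
import re

# One word for the status, then the message in parentheses (assumed shape).
STATUS_LINE = re.compile(r"^\s*(\w+)\((.*)\)\s*$")


def parse_script_output(output):
    """Split 'STATUS(message)' into a (status, message) tuple."""
    found = STATUS_LINE.match(output)
    if found is None:
        raise ValueError("output is not in STATUS(message) form: %r" % output)
    return found.group(1), found.group(2)
```

A helper script that prints a line such as OK(all good) would parse to the status OK and the message "all good" in this sketch, while output in any other shape is rejected.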
SIMPLESVN
This module, like the FRESHSVN module, checks for the health of a subversion checkout, making sure it is up to date and that there are no problems. It does not check that the working copy is checked out from a specific repository, nor does it have any grace period. However, it will work with older versions of subversion and may be preferable to the FRESHSVN module in some circumstances.
Sample Configuration
SIMPLESVN {
/path/to/working/copy : noop
}
Arguments
- Object : the path to the working copy
SMFMAINTENANCE
This module checks for any solaris services in maintenance mode.
Sample Configuration
SMFMAINTENANCE {
services : noop
}
SWAPSIZE
This module monitors the memory used on Solaris by inspecting the usage of the /tmp directory.
Sample Configuration
SWAPSIZE {
swap : limit => 262144
}
Arguments
- Object : this is just a label used to identify the check
- limit : the minimum amount of free memory below which we go critical
TCPSERVICE
This module connects to a tcp service at regular intervals, going critical if the connection fails.
Sample Configuration
TCPSERVICE {
ssh : host => 127.0.0.1, port => 22, timeout => 2
}
Arguments
- Object : this is just a label used to identify the check
- host : the host to connect to
- port : the port to connect to
- timeout : how long to wait, in seconds, for a connection before going critical
- prepost : (optional) a string to send on connection. Useful if the service you are checking requires something to be entered before showing a banner.
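The connection test can be sketched in a few lines of Python. This is an illustration only and not the module's code: the function name and state names are hypothetical, and the sketch simply treats a successful connect (plus sending prepost, if given) as OK, without reading or checking any banner.

```python
import socket


def tcp_check(host, port, timeout=2.0, prepost=None):
    """Return 'OK' if a TCP connection succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            if prepost is not None:
                # Some services expect input before they respond.
                conn.sendall(prepost.encode())
            return "OK"
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return "BAD"
```

Called as tcp_check("127.0.0.1", 22, timeout=2), this mirrors the ssh entry in the sample configuration.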
WALCHECK
This module monitors the postgresql log file replay from a master to a slave.
Sample Configuration
WALCHECK {
check_pg_replay_mode : logdir => /data/postgres/82/pg_log
}
Arguments
- Object : this is just a label used to identify the check
- logdir : the location of the logs
ZIMBRA
This module checks zimbra's service status and goes critical if any services are down.
Sample Configuration
ZIMBRA {
services : noop
}
ZPOOLERRS
This module checks for zpool read/write errors by using zpool status -x. It will also report whether a zpool is degraded, similar to the basic zpool check.
This check can be used either in combination with the ZPOOL check or instead of it. If used in combination, it is probably a good idea to warn or email when the ZPOOLERRS check goes bad, and page when the ZPOOL check goes bad. If you wish to page on read/write errors as well as degraded arrays, then only the ZPOOLERRS check is required.
Sample Configuration
ZPOOLERRS {
zpools : noop
}
ZPOOLFREE
This module monitors the free space in a zfs pool using the zpool command. If your filesystems are all part of a zpool, it is often more informative to use this module rather than the DISK module. Otherwise, when the pool fills up, every filesystem-based check goes to 100% full and the cause isn't obvious.
Sample Configuration
ZPOOLFREE {
pool1 : limit => 90%
pool2 : limit => 90%
}
Arguments
- Object : the name of the zpool you wish to monitor
- limit : how full to get before going critical
ZPOOL
This module looks for degraded zpools, but does not go critical if there are any recoverable errors that do not cause the array to be degraded.
Sample Configuration
ZPOOL {
zpools : noop
}
