Creating a custom Nagios/Centreon passive alerter

Scenario:

This is very probably a familiar problem that any sysadmin must solve: using your monitoring setup to receive alerts from other servers. Note the emphasis on "receive". This is different from the standard polls / checks that come out of the box with Nagios. Say, for example, you would like to receive an alert whenever a remote server finishes a task, such as a backup. Or you would like a remote server to send an alert whenever a certain keyword appears in its logs. In nagios-speak these are classified as "passive services", i.e. the nagios server passively waits for traps coming from the remote servers.

 

The reasons behind building your own:

There’s a lot of debate here, because there are already a couple of solutions to the above problem. It boils down to how flexible you need to be, and how many features you need. Here is a quick discussion of some alternatives you may consider before going ahead:

 

  • Use SNMP traps

This is by far the most standard option to implement, but it is also the most limited in my opinion. While it is easy to set up SNMP traps to alert you whenever something standard goes wrong (such as high CPU or high memory usage), it is much more difficult to customise SNMP traps to alert you on non-standard objects such as log filtering (in fact, documentation in this department of SNMP is a bit lacking). So I do use SNMP traps, but only for what they’re good at, i.e. the standard checks. When more granular checks were needed, SNMP traps were too unwieldy.

  • Use NSCA or send_gearman
These solutions come very close to what I wanted to do, and they are a lot more flexible than SNMP since your own bash scripts are easy to plug in to these frameworks. But there was one major drawback for me: the fact that they need a separate agent and/or libraries installed on the remote server. The less I install on my servers, the less can go wrong. Besides, almost anything is possible to script, so why not build a script that can monitor / backup / [insert action here] and send a notification to nagios natively?

 

So I set out to do exactly that. The end result is quite simple, a solution that consists of three parts:
  1. A passive service defined in centreon / nagios, that is controlled via the external command file in nagios
  2. An xinetd server on the nagios server that listens for the remote server traps
  3. A client-side script which, besides performing its duties such as backing up files, monitoring logs, etc, also sends a trap to the nagios server
If you are a bash junkie, the above solution is definitely the way to go. In more detail:

 

1.   Setting up the passive service in centreon

 

First we define a command which doesn’t really do anything. It just sits there, but we need it in order to bind it to a service in centreon, since a service must have at least one command. So under Configuration > Commands > Add, define a command of type "check". I used the following and named it "notifier":

 

$USER1$/check_centreon_dummy -s 0 -o "No new notifiers"

 

The check_centreon_dummy script doesn’t really do anything except say "OK". Next we define the actual passive service. Under Configuration > Services > Add I added a service using the following attributes under "Service Configuration":

 

Description: check_notifier
Volatile: yes
Check Command: notifier
Active Check Enabled: no
Passive Checks Enabled: yes

 

Under "Data Processing" tab ensure that check freshness is set to "no".

 

That’s all that’s needed…. (simpler than setting up SNMP traps also huh?)
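Before building anything else, it can be worth checking that the passive service really accepts results. A minimal sketch, assuming you run it on the nagios server as a user allowed to write to the external command file, and that the file lives at the path used later in this post. The function names and the host name "myhost" are made up for illustration:

```shell
#!/bin/bash
# Path of the nagios external command file; adjust for your install.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Build one external command line in the form:
# [time] PROCESS_SERVICE_CHECK_RESULT;<host>;<service>;<return_code>;<plugin_output>
format_check_result() {
    local host="$1" service="$2" code="$3" output="$4"
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$host" "$service" "$code" "$output"
}

# Append a result to the command file.
submit_local_result() {
    format_check_result "$@" >> "$CMD_FILE"
}

# Example (hypothetical host name):
# submit_local_result myhost check_notifier 0 "OK - manual test"
```

If the service picks up the test message in the centreon interface, the passive side is working and any later problem is in the network path.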

 

2.   Writing the Xinetd server

 

This is done on the nagios server itself. First of all you need to be sure xinetd is installed. In my case, since I’m not using it for anything else, I also disabled any other daemons. I will be calling my server "alerter". So under /etc/xinetd.d/ create a file called "alerter" and use:

 

# This is the configuration for the alerter to send notifications via Nagios / Centreon

service alerter
{
# This is for quick on or off of the service
        disable         = no
 
# The next attributes are mandatory for all services
        wait            = no
        socket_type     = stream

 

# External services must fill out the following
        user            = nagios
        group           = nagios
        server          = /documents/alerter.sh

 

# External services not listed in /etc/services must fill out the next one
        port            = 3333
}

 

Here I’ve used port 3333; choose any port you wish. Just be sure to modify /etc/services to include something similar to:
# Local services
alerter         3333/tcp
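
If you script your server builds, the /etc/services edit can be made repeatable. A minimal sketch, assuming the service name and port used above; the function name ensure_service_entry is made up for this example and is not part of any tool:

```shell
#!/bin/bash
# Hypothetical helper: add "<name> <port>/tcp" to a services file, but only
# if an entry for that name and port is not already there.
ensure_service_entry() {
    local name="$1" port="$2" file="${3:-/etc/services}"
    if grep -Eq "^${name}[[:space:]]+${port}/tcp" "$file"; then
        return 0
    fi
    printf '%s\t\t%s/tcp\n' "$name" "$port" >> "$file"
}

# Example (run as root): ensure_service_entry alerter 3333
```

After changing either file, xinetd needs to be restarted or reloaded before it starts listening on the new port; how you do that depends on your distribution.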

 

Now, under /documents/alerter.sh, we have a very simple script:
#!/bin/bash

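# Read a single line from the client; -r keeps backslashes literal
read -r message
# Quote the variable so whitespace and glob characters are written unchanged
printf '%s\n' "$message" >> /usr/local/nagios/var/rw/nagios.cmd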

 

This script reads the input from port 3333 and places it into nagios’ external command file, nagios.cmd. There is plenty of documentation on how to use this file here:
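
One thing to keep in mind is that the listener passes on whatever it receives, and the external command file accepts many commands besides check results. A possible hardening, sketched here under the assumption that you only ever want to accept PROCESS_SERVICE_CHECK_RESULT lines: check the line against a pattern before writing it. The function names are made up for illustration:

```shell
#!/bin/bash
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Accept only one well-formed passive service check result:
# [epoch] PROCESS_SERVICE_CHECK_RESULT;host;service;0-3;output
is_valid_result() {
    local line="$1"
    local pattern='^\[[0-9]+\] PROCESS_SERVICE_CHECK_RESULT;[^;]+;[^;]+;[0-3];.+$'
    [[ $line =~ $pattern ]]
}

# Read one line from stdin and forward it only if it passes the check.
forward_result() {
    local message
    read -r message || [ -n "$message" ] || return 1
    is_valid_result "$message" || return 1
    printf '%s\n' "$message" >> "$CMD_FILE"
}

# In alerter.sh the body would then simply be:
# forward_result
```

Anything that does not look like a service check result is silently dropped, so a client that can reach the port cannot send other commands to nagios.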

 

 

3.   Writing a client-side script to send alerts to our passive server.

 

So the last step is using the server. In this example, my script actually performs a backup, and then should alert us via centreon that the backup succeeded. I won’t paste the whole script here, we’re only interested in sending the alert, whatever it is, to centreon:

 

#!/bin/bash

#here we get the time in unix format to use later#
timestamp=$(date +%s)

[ …… script actions ……. ]

#here using bash sockets we open a connection to our passive server… note this is a cool native bash ability, no external libraries#
#also note the IP… change this to your nagios server IP#
exec 3<>/dev/tcp/1.1.1.1/3333

#here we send our message to the server… notice the "PROCESS_SERVICE_CHECK_RESULT" command… see the second link above for more details#
echo "[$timestamp] PROCESS_SERVICE_CHECK_RESULT;Backups;OTRS_backup;0;OK - OTRS Backup Successful" >&3

 

#it is important to close the bash sockets… "exec" is needed so that the close applies to the current shell#
exec 3<&-
exec 3>&-
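
The client side can also be wrapped into a couple of small functions so that any script can report success or failure. This is only a sketch: the function names and the NAGIOS_SERVER / NAGIOS_PORT variables are made up for illustration, and mapping every non-zero exit status to CRITICAL is an assumption you may want to refine. It uses a redirection scoped to a single command instead of exec, so bash closes the socket by itself when that command finishes:

```shell
#!/bin/bash
# Placeholders: change these to your nagios server IP and alerter port.
NAGIOS_SERVER="${NAGIOS_SERVER:-1.1.1.1}"
NAGIOS_PORT="${NAGIOS_PORT:-3333}"

# Translate an exit status into "<nagios return code>;<label>".
state_for_exit() {
    if [ "$1" -eq 0 ]; then
        echo "0;OK"
    else
        echo "2;CRITICAL"
    fi
}

# build_result <host> <service> <exit_status> <text>
build_result() {
    local host="$1" service="$2" result text="$4"
    result="$(state_for_exit "$3")"
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s - %s\n' \
        "$(date +%s)" "$host" "$service" "$result" "$text"
}

# send_result takes the same arguments as build_result and writes the line
# to the listener. The socket only stays open for this one command.
send_result() {
    build_result "$@" > "/dev/tcp/${NAGIOS_SERVER}/${NAGIOS_PORT}"
}

# Example, right after the backup command in your script:
# send_result Backups OTRS_backup $? "OTRS backup finished"
```

Because the exit status of the backup command is passed in, the same call reports both the success and the failure case.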

 

For more information on bash sockets, have a quick look at the informative post here:


Now of course you can modify the script every which way… make it perform any number of checks, insert any logic, and output any result to the nagios passive service…… and this is what I was after from the very beginning🙂

 

You should have a nice alert showing up, something like: