Create separate measurement tables in InfluxDB for Icinga 2 NRPE checks
Tuesday - Dec 12th 2017 - by - (2 comments)

In a previous article I described how Icinga 2 performance graphs can be created using InfluxDB and Grafana. At the end of that article I added a special note concerning NRPE checks:

Note: For NRPE checks you will have to adapt the graphs because these performance data are stored in the "nrpe" measurements table. 

My monitoring architecture relies heavily on remotely executed checks using check_nrpe. As a result, almost all system-related information (CPU, memory, network I/O, disk space, etc.) was collected in one and the same measurement table, nrpe:

root@inf-mon02-t:~# influx
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 0.10.0
InfluxDB shell 0.10.0
> USE icinga2
Using database icinga2
> SHOW MEASUREMENTS
name: measurements
------------------
name
apt
disk
hostalive
http
icinga
load
ping4
ping6
procs
ssh
swap
users

At the beginning of this year, in January 2017, I had some problems with PNP4Nagios and NRPE checks. I was unable to control the graph's behavior for certain remotely executed checks, because PNP4Nagios interpreted all of them as the same plugin: check_nrpe. With a workaround (applying a special variable containing the NRPE check command) I was able to create separate PNP4Nagios templates for each individual remote NRPE check command (see the article Creating custom PNP4Nagios template in Icinga 2 for NRPE checks for more details).
Where am I going with this? The same workaround can also be applied to the InfluxdbWriter object!

First I modified the apply rule which adds the remote disk usage checks (you guessed it, using check_nrpe) to the Linux hosts:

apply Service "Diskspace " for (partition_name => config in host.vars.partitions) {
  import "generic-service"

  vars += config
  if (!vars.warn) { vars.warn = "15%" }
  if (!vars.crit) { vars.crit = "5%" }
  if (!vars.iwarn) { vars.iwarn = "15%" }
  if (!vars.icrit) { vars.icrit = "5%" }
  if (!vars.service) { vars.service = "generic-service" }

  import vars.service

  display_name = "Diskspace " + partition_name
  check_command = "nrpe"
  vars.nrpe_command = "check_disk"
  vars.nrpe_arguments = [ vars.warn, vars.crit, partition_name, vars.iwarn, vars.icrit ]
  vars.influx_append = "_$nrpe_command$"

  assign where host.address && host.vars.os == "Linux"
  ignore where host.vars.applyignore.partitions == true
}

Note: For more information about such advanced Icinga2 configurations using apply rules, take a look at Icinga 2: Advanced usage of arrays/dictionaries for monitoring of partition.

Take a look at the following line:

  vars.influx_append = "_$nrpe_command$"

Here I define a new variable, influx_append. It is a string starting with an underscore (_) followed by the value of the variable nrpe_command, which is check_disk, as you can see two lines above it. Whenever this applied disk usage check runs, the service object now also contains the variable influx_append, which can then be used in the InfluxdbWriter.

The InfluxdbWriter feature object needs to be modified so that the name of the measurement table to use (or create) includes the value of the influx_append variable. This is how I've done it:

root@inf-mon02-t:~# cat /etc/icinga2/features-enabled/influxdb.conf
/**
 * The InfluxdbWriter type writes check result metrics and
 * performance data to an InfluxDB HTTP API
 */

library "perfdata"

object InfluxdbWriter "influxdb" {
  //host = "127.0.0.1"
  //port = 8086
  //database = "icinga2"
  //flush_threshold = 1024
  //flush_interval = 10s
  //host_template = {
  //  measurement = "$host.check_command$"
  //  tags = {
  //    hostname = "$host.name$"
  //  }
  //}
  service_template = {
  //  measurement = "$service.check_command$"
    measurement = "$service.check_command$$influx_append$"
    tags = {
      hostname = "$host.name$"
      service = "$service.name$"
    }
  }
}

As you can see, I kept the defaults but un-commented the service_template part. The original measurement definition is still there (commented out). I slightly modified it:

    measurement = "$service.check_command$$influx_append$"

So the name of the measurement table is now extended with the value of influx_append. The nice thing is that this doesn't change anything for locally executed checks like http or ldap, because the variable influx_append is empty unless it comes from the NRPE disk usage check. On the other hand, as soon as a disk usage check is executed through check_nrpe, the variable contains a value and the measurement name becomes nrpe_check_disk.
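
The way the measurement name is composed can be illustrated with a small Python sketch. This is only a simplified model of macro substitution, not Icinga 2's actual implementation: the resolve function and the two dictionaries are hypothetical names made up for this illustration, and unknown macros are simply treated as empty strings.

```python
import re

# Matches Icinga-style runtime macros such as $service.check_command$.
MACRO = re.compile(r"\$([A-Za-z0-9_.]+)\$")


def resolve(template: str, values: dict) -> str:
    """Replace $name$ macros; unknown macros become empty strings (assumption)."""
    def substitute(match):
        # Resolve recursively, because a value may itself contain macros,
        # as in influx_append = "_$nrpe_command$".
        return resolve(values.get(match.group(1), ""), values)

    return MACRO.sub(substitute, template)


TEMPLATE = "$service.check_command$$influx_append$"

# Hypothetical variable sets for a remote disk check and a local http check.
nrpe_disk = {
    "service.check_command": "nrpe",
    "nrpe_command": "check_disk",
    "influx_append": "_$nrpe_command$",
}
local_http = {"service.check_command": "http"}

print(resolve(TEMPLATE, nrpe_disk))   # nrpe_check_disk
print(resolve(TEMPLATE, local_http))  # http
```

The same pattern would give any other NRPE command its own measurement name, for example nrpe_check_load for a service whose nrpe_command is check_load.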

After a restart of Icinga 2, the following can be seen in the debug logs (you must enable debug level in /etc/icinga2/features-enabled/mainlog.conf):

[2017-12-12 14:13:16 +0100] debug/InfluxdbWriter: Add to metric list: 'nrpe_check_disk,hostname=remoteserver01,service=Diskspace\ /var,metric=/var value=387973120 1513084396'.
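
The metric in this log entry is written in InfluxDB's line protocol: measurement and tags, then the field, then the timestamp. The following Python sketch splits such a line into its parts to show how they map to the columns of the measurement table. It is a deliberately simplified parser (one numeric field, only backslash-escaped spaces in tag values) and parse_line is a hypothetical helper name, not part of any InfluxDB client library.

```python
import re


def parse_line(line: str) -> dict:
    """Parse a simple line protocol entry with a single numeric field."""
    # Split on spaces that are not escaped with a backslash.
    key, field, timestamp = re.split(r"(?<!\\) ", line)
    # The key holds the measurement name followed by comma-separated tags.
    measurement, *tag_pairs = re.split(r"(?<!\\),", key)
    tags = {}
    for pair in tag_pairs:
        name, value = pair.split("=", 1)
        tags[name] = value.replace("\\ ", " ")  # un-escape spaces
    field_name, field_value = field.split("=", 1)
    return {
        "measurement": measurement,
        "tags": tags,
        "fields": {field_name: float(field_value)},
        "timestamp": int(timestamp),
    }


# The metric string from the debug log entry above.
LINE = (
    r"nrpe_check_disk,hostname=remoteserver01,"
    r"service=Diskspace\ /var,metric=/var value=387973120 1513084396"
)

parsed = parse_line(LINE)
print(parsed["measurement"])      # nrpe_check_disk
print(parsed["tags"]["service"])  # Diskspace /var
```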

This can now be verified in InfluxDB:

root@inf-mon02-t:~# influx
Visit https://enterprise.influxdata.com to register for updates, InfluxDB server management, and monitoring.
Connected to http://localhost:8086 version 0.10.0
InfluxDB shell 0.10.0
> use icinga2
Using database icinga2
> show measurements
name: measurements
------------------
name
apt
disk
dns
hostalive
http
icinga
ldap
load
nrpe
nrpe_check_disk
ping4
ping6
procs
ssh
swap
users

Indeed, the measurement table nrpe_check_disk was created! Let's check the content:

> select * from nrpe_check_disk
name: nrpe_check_disk
---------------------
time            hostname        metric  service         value
1513084394000000000     remoteserver01    /var    Diskspace /var  3.9845888e+08
1513084395000000000     remoteserver02    /       Diskspace /     2.524971008e+09
1513084395000000000     remoteserver01    /tmp    Diskspace /tmp  1.048576e+06
1513084396000000000     remoteserver02    /var    Diskspace /var  3.8797312e+08
1513084396000000000     remoteserver02    /tmp    Diskspace /tmp  1.048576e+06
1513084451000000000     remoteserver01    /var    Diskspace /var  3.9845888e+08
1513084452000000000     remoteserver02    /       Diskspace /     2.524971008e+09
1513084452000000000     remoteserver01    /tmp    Diskspace /tmp  1.048576e+06
1513084454000000000     remoteserver02    /tmp    Diskspace /tmp  1.048576e+06
1513084454000000000     remoteserver02    /var    Diskspace /var  3.8797312e+08
1513084508000000000     remoteserver01    /var    Diskspace /var  3.9845888e+08
1513084510000000000     remoteserver02    /       Diskspace /     2.524971008e+09
1513084510000000000     remoteserver01    /tmp    Diskspace /tmp  1.048576e+06
1513084512000000000     remoteserver02    /var    Diskspace /var  3.8797312e+08
1513084512000000000     remoteserver02    /tmp    Diskspace /tmp  1.048576e+06

Success! Now I have a separate measurement table for this type of remote check. This makes queries easier than having all the remote NRPE checks in one measurement table.
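
To illustrate what such a query could look like, here is a minimal Python sketch that builds a request URL for InfluxDB's /query HTTP endpoint. The helper build_query_url is a hypothetical name, the host and database are the ones used in this article, and the filter values are simply taken from the table above. The sketch only builds the URL and does not send a request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(base_url: str, database: str, influxql: str) -> str:
    """Return a URL for the InfluxDB /query endpoint (db and q parameters)."""
    return f"{base_url}/query?" + urlencode({"db": database, "q": influxql})


# With a dedicated measurement, the check type no longer needs to be filtered:
# only the host and partition tags are required.
QUERY = (
    'SELECT "value" FROM "nrpe_check_disk" '
    "WHERE \"hostname\" = 'remoteserver01' AND \"metric\" = '/var'"
)

url = build_query_url("http://localhost:8086", "icinga2", QUERY)
print(url)

# Decoding the URL again returns the original parameters.
print(parse_qs(urlsplit(url).query)["q"][0])
```

The same InfluxQL statement can also be pasted into the influx shell or used as the query of a Grafana panel.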

Update July 30th 2018

As you can see below in the comments, after the "influx_append" was added into the InfluxDB feature config, Icinga 2 writes a lot of warnings into /var/log/icinga2/icinga2.log like these:

[2018-07-30 09:45:57 +0200] warning/MacroProcessor: Macro 'influx_append' is not defined.
        (0) Resolving macros for string '$service.check_command$$influx_append$'

This happened for all the service checks which don't have the special variable "influx_append" defined, for example "http" or "ssh". I tried to define a global value "influx_append" in constants.conf (see my comment) but this didn't work.

However, when I defined a service variable in the service template (vars.influx_append) and set it to an empty string, all the warnings were gone (because it is now a defined variable). This is how I did it.

Basically all my services are using different templates (based on check times, criticality, etc). But all the different templates have one thing in common: A base template. And in this base template I defined this variable:

# cat /etc/icinga2/zones.d/global-templates/templates/service-base-template.conf
################################################################
# SERVICE TEMPLATE DEFINITIONS
################################################################
# service-base
# This service template is being inherited by other service templates
# Use it for settings which apply on ALL services
#################################
template Service "service-base" {
        notes_url = "/pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$"
        check_period = "24x7"
        vars.influx_append = ""
}

As this is the lowest-level definition of a service, a service can still overwrite the variable later (see the apply rule for the "Diskspace" service above).
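
Why the empty default silences the warnings can be modelled in a few lines of Python. This is a rough sketch under stated assumptions, not Icinga 2's real template or macro code: effective_vars and resolve are hypothetical helpers, template inheritance is reduced to merging dictionaries, and an undefined macro is assumed to resolve to an empty string while producing a warning.

```python
import re

MACRO = re.compile(r"\$([A-Za-z0-9_.]+)\$")


def effective_vars(*layers: dict) -> dict:
    """Merge variable layers; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


def resolve(template: str, values: dict, warnings: list) -> str:
    """Replace $name$ macros and record a warning for each undefined one."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            warnings.append(f"Macro '{name}' is not defined.")
            return ""
        return resolve(values[name], values, warnings)

    return MACRO.sub(substitute, template)


TEMPLATE = "$service.check_command$$influx_append$"
service_base = {"influx_append": ""}  # the default from the base template
http_service = {"service.check_command": "http"}
disk_service = {
    "service.check_command": "nrpe",
    "nrpe_command": "check_disk",
    "influx_append": "_$nrpe_command$",  # overrides the empty default
}

without_base, with_base = [], []
print(resolve(TEMPLATE, effective_vars(http_service), without_base))
print(without_base)  # one warning about influx_append
print(resolve(TEMPLATE, effective_vars(service_base, http_service), with_base))
print(with_base)     # no warnings
print(resolve(TEMPLATE, effective_vars(service_base, disk_service), []))
```

In this model the measurement name for the http service is the same in both cases. The only difference is that the variable is now defined, so no warning is recorded.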

Comments (newest first):

ck from Switzerland wrote on Jul 17th, 2018:
VerboEse, I have the same problem here with the warnings on all other services. I did not fix it yet but an idea would be to define influx_append on top level. Maybe in constants, but not sure if that works. Otherwise on all other services (influx_append = "") but that's kind of overkill, I agree.

VerboEse wrote on Jul 16th, 2018:
Hi.
The idea is great! There is one problem though: for services not done via nrpe I get errors in my icinga log:
---
[2018-07-16 18:44:30 +0200] warning/MacroProcessor: Macro 'influx_append' is not defined.
Context:
(0) Resolving macros for string '$service.check_command$$influx_append$'
(1) Processing check result for 'icinga.mydomain.cxm!cluster'
---
I don't understand the definition enough for getting rid of these.
