Creating an InfluxDB asynchronous replication using subscription service

Written by - 1 comments

Published on November 13th 2018 - Listed in DB Databases InfluxDB Monitoring Icinga

Today a full disk corrupted my InfluxDB 0.8 metrics database for Icinga2 monitoring and I was unable to recover the data.
I found quite some issues with the same errors I found in the logs but all of them were fixed in more recent versions. Luckily this database is not in production yet so this is probably (and forcibly) the day to use a newer InfluxDB version.

InfluxDB 0.8 came from the Ubuntu repositories itself. And it featured a cluster setup! Unfortunately newer versions do not support a cluster setup anymore, unless you buy the license for the Enterprise Edition of InfluxDB. That was the reason why I thought I'd just stay on 0.8. But, as mentioned, a lot of bugfixes and improvements happened since then. 

However I didn't want to give up on the cluster or to have at least a standby InfluxDB that the graphs can still be shown, even if the primary monitoring server is down. This is when I came across subscriptions.

According to the documentation, the receiving InfluxDB copies data to known subscribers. Imagine this like a mail server sending a newsletter to registered subscribers (this comparison really helps, doesn't it? You're welcome!). In order to do that, the subscriber service needs to be enabled in the InfluxDB config (usually /etc/influxdb/influxdb.conf):

  # Determines whether the subscriber service is enabled.
  enabled = true

Restart InfluxDB after this setting change.

On the master server you need to define the known subscribers:

root@influx01:/# influx
Connected to http://localhost:8086 version 1.6.4
InfluxDB shell version: 1.6.4
> CREATE SUBSCRIPTION "icinga-replication" ON "icinga"."autogen" DESTINATIONS ALL ''

So what does it do?

Obviously a new subscription with the unique ID "icinga-replication" is created. It covers the database "icinga" and sets a retention policy of "autogen".
As destination the transport over http was chosen and endpoint is, which is InfluxDB running on host influx02.

By showing the subscriptions (SHOW SUBSCRIPTIONS), this can be confirmed:

name: icinga
retention_policy name               mode destinations
---------------- ----               ---- ------------
autogen          icinga-replication ALL  []

From now on every record written into the influx01 instance is copied to to influx02 and will show up with such entries on influx02:

influx02 influxd[5045]: [httpd] - icinga [13/Nov/2018:14:23:41 +0100] "POST /write?consistency=&db=icinga&precision=ns&rp=autogen HTTP/1.1" 204 0 "-" "InfluxDBClient" 564b790a-e747-11e8-8b0c-000000000000 13708

But what about authentication?

When authentication is enabled (which should always be the case on a database), the following error message appears in the log file on the master:

influx01 influxd[7232]: ts=2018-11-13T10:18:11.142719Z lvl=info msg="{\"error\":\"unable to parse authentication credentials\"}\n" log_id=0Bk8uDzG000 service=subscriber

In this case, the subscription needs to be added with credentials in the URL string:

> CREATE SUBSCRIPTION "icinga-replication" ON "icinga"."autogen" DESTINATIONS ALL 'http://dbuser:password@'

Note: I used the same subscription ID again (icinga-replication). In order to do so, the existing subscription must be removed:

> DROP SUBSCRIPTION "icinga-replication" ON "icinga"."autogen"

And what about data prior to the subscription "replication"?

The subscription service only copies incoming data. This means that older data is not copied over to the subscribers. In this case you need to transfer the data using dump/restore or even by syncing the data (/var/lib/influxdb/data) and wal (/var/lib/influxdb/wal) directories.
In my case I stopped InfluxDB on both influx01 and influx02, rsynced the content of /var/lib/influxdb/data and /var/lib/influxdb/wal from influx01 to influx02 and then started InfluxDB again. First on host influx02, then on influx01. Now data is in sync (until a network issue or similar happens).

As this graphing database is not production critical I can live with that situation.

Add a comment

Show form to leave a comment

Comments (newest first)

droidrider from France wrote on Dec 22nd, 2018:

Thank you very much for this article.
It work smoothly with InfluxDB 1.7.1.