Today a full disk corrupted my InfluxDB 0.8 metrics database for Icinga2 monitoring and I was unable to recover the data.
I found quite a few bug reports describing the same errors I saw in the logs, but all of them were fixed in more recent versions. Luckily this database is not in production yet, so this is probably (by necessity) the day to move to a newer InfluxDB version.
InfluxDB 0.8 came straight from the Ubuntu repositories, and it featured a cluster setup! Unfortunately newer versions no longer support clustering, unless you buy a license for the Enterprise Edition of InfluxDB. That was the reason why I thought I'd just stay on 0.8. But, as mentioned, a lot of bug fixes and improvements have happened since then.
However, I didn't want to give up on the cluster idea entirely. I wanted at least a standby InfluxDB so that the graphs can still be shown even if the primary monitoring server is down. This is when I came across subscriptions.
According to the documentation, the receiving InfluxDB copies data to known subscribers. Imagine this like a mail server sending a newsletter to registered subscribers (this comparison really helps, doesn't it? You're welcome!). In order to do that, the subscriber service needs to be enabled in the InfluxDB config (usually /etc/influxdb/influxdb.conf):
[subscriber]
# Determines whether the subscriber service is enabled.
enabled = true
Restart InfluxDB after this setting change.
On the master server you need to define the known subscribers:
root@influx01:/# influx
Connected to http://localhost:8086 version 1.6.4
InfluxDB shell version: 1.6.4
> SHOW SUBSCRIPTIONS
> CREATE SUBSCRIPTION "icinga-replication" ON "icinga"."autogen" DESTINATIONS ALL 'http://10.10.1.2:8086'
So what does it do?
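A new subscription named "icinga-replication" is created. It covers the database "icinga" and, within it, the retention policy "autogen" (the statement selects an existing retention policy, it does not create or change one).
The destination uses HTTP as transport and the endpoint is 10.10.1.2:8086, which is InfluxDB running on host influx02. The mode ALL means that every write is sent to all listed destinations; with ANY, writes would be distributed round-robin across them.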
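The same statement can also be sent over the HTTP API instead of the influx shell. The following is a minimal sketch, assuming the InfluxDB 1.x /query endpoint on localhost:8086 and no authentication on the master; the helper names create_subscription_stmt and run_statement are made up for this example. With authentication enabled, the request would additionally need credentials.

```python
import urllib.parse
import urllib.request


def create_subscription_stmt(name, database, retention_policy, destinations, mode="ALL"):
    """Build a CREATE SUBSCRIPTION statement like the one typed into the shell."""
    if mode not in ("ALL", "ANY"):
        raise ValueError("mode must be ALL or ANY")
    # Each destination is an InfluxQL string literal in single quotes.
    dests = ", ".join("'{}'".format(d) for d in destinations)
    return 'CREATE SUBSCRIPTION "{}" ON "{}"."{}" DESTINATIONS {} {}'.format(
        name, database, retention_policy, mode, dests
    )


def run_statement(base_url, statement, timeout=10):
    """POST one statement to /query and return the raw response body.

    Assumption: base_url points at an InfluxDB 1.x instance without auth.
    """
    data = urllib.parse.urlencode({"q": statement}).encode("ascii")
    url = base_url.rstrip("/") + "/query"
    with urllib.request.urlopen(url, data=data, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    stmt = create_subscription_stmt(
        "icinga-replication", "icinga", "autogen", ["http://10.10.1.2:8086"]
    )
    print(stmt)
    # Uncomment to really create the subscription on the master:
    # print(run_statement("http://localhost:8086", stmt))
```

Building the statement in a separate function keeps it easy to check the exact text before anything is sent to the server.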
By showing the subscriptions (SHOW SUBSCRIPTIONS), this can be confirmed:
> SHOW SUBSCRIPTIONS
name: icinga
retention_policy name mode destinations
---------------- ---- ---- ------------
autogen icinga-replication ALL [http://10.10.1.2:8086]
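For a monitoring check it can be handy to verify the subscription from a script. This is only a sketch: it assumes the JSON layout that the InfluxDB 1.x /query endpoint normally returns for SHOW SUBSCRIPTIONS (one series per database, with the same four columns as in the shell output above). The SAMPLE document below is hand-written for illustration, not captured from a server, and subscription_destinations is a hypothetical helper name.

```python
import json

# Hand-written sample in the assumed /query response layout (not real server output).
SAMPLE = json.dumps({
    "results": [{
        "statement_id": 0,
        "series": [{
            "name": "icinga",
            "columns": ["retention_policy", "name", "mode", "destinations"],
            "values": [["autogen", "icinga-replication", "ALL",
                        ["http://10.10.1.2:8086"]]],
        }],
    }]
})


def subscription_destinations(response_text, database, name):
    """Return the destination list of one subscription, or [] if it is missing."""
    doc = json.loads(response_text)
    for result in doc.get("results", []):
        for series in result.get("series", []):
            if series.get("name") != database:
                continue
            columns = series.get("columns", [])
            for row in series.get("values", []):
                record = dict(zip(columns, row))
                if record.get("name") == name:
                    return list(record.get("destinations") or [])
    return []


if __name__ == "__main__":
    print(subscription_destinations(SAMPLE, "icinga", "icinga-replication"))
```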
From now on every record written into the influx01 instance is copied to influx02, where it shows up in the log with entries like this:
influx02 influxd[5045]: [httpd] 10.10.1.1 - icinga [13/Nov/2018:14:23:41 +0100] "POST /write?consistency=&db=icinga&precision=ns&rp=autogen HTTP/1.1" 204 0 "-" "InfluxDBClient" 564b790a-e747-11e8-8b0c-000000000000 13708
When authentication is enabled on the subscriber (which should always be the case on a database) and the subscription was created without credentials, the following error message appears in the log file on the master:
influx01 influxd[7232]: ts=2018-11-13T10:18:11.142719Z lvl=info msg="{\"error\":\"unable to parse authentication credentials\"}\n" log_id=0Bk8uDzG000 service=subscriber
In this case, the subscription needs to be added with credentials in the URL string:
> CREATE SUBSCRIPTION "icinga-replication" ON "icinga"."autogen" DESTINATIONS ALL 'http://dbuser:password@10.10.1.2:8086'
Note: I used the same subscription name again (icinga-replication). For this to work, the existing subscription must be removed first:
> DROP SUBSCRIPTION "icinga-replication" ON "icinga"."autogen"
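The drop-and-recreate step can be sketched as a small helper that produces both statements in the right order. The function names are hypothetical. One assumption to flag loudly: the user and password are percent-encoded so that characters like @ or a single quote do not break the URL or the InfluxQL string literal, and I have not verified that every InfluxDB version decodes them again on the sending side. With plain alphanumeric credentials, as in the example above, the result is identical to the URL typed by hand.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def destination_with_credentials(url, user, password):
    """Return url with user:password@ placed in front of the host part."""
    parts = urlsplit(url)
    # Drop any credentials that are already present in the URL.
    host_port = parts.netloc.rpartition("@")[2]
    # Assumption: percent-encoded credentials are accepted by the subscriber service.
    netloc = "{}:{}@{}".format(quote(user, safe=""), quote(password, safe=""), host_port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


def replace_subscription_stmts(name, database, retention_policy, destination):
    """Return [DROP ..., CREATE ...] for re-creating a subscription under the same name."""
    target = '"{}" ON "{}"."{}"'.format(name, database, retention_policy)
    return [
        "DROP SUBSCRIPTION " + target,
        "CREATE SUBSCRIPTION {} DESTINATIONS ALL '{}'".format(target, destination),
    ]


if __name__ == "__main__":
    dest = destination_with_credentials("http://10.10.1.2:8086", "dbuser", "password")
    for stmt in replace_subscription_stmts("icinga-replication", "icinga", "autogen", dest):
        print(stmt)
```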
The subscription service only copies new incoming data; older data is not copied over to the subscribers. To get existing data onto a subscriber you need to transfer it yourself, either using dump/restore or by syncing the data (/var/lib/influxdb/data) and wal (/var/lib/influxdb/wal) directories.
In my case I stopped InfluxDB on both influx01 and influx02, rsynced the content of /var/lib/influxdb/data and /var/lib/influxdb/wal from influx01 to influx02 and then started InfluxDB again, first on host influx02, then on influx01. Now the data is in sync (until a network issue or similar happens).
As this graphing database is not production critical I can live with that situation.
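The manual resync described above can be written down as an ordered list of commands. This is a sketch under several assumptions that are not from my actual shell history: it is run as root on influx01, the service is managed through systemctl under the unit name influxdb, influx02 is reachable via ssh, and rsync is called with -a --delete. By default it only prints the commands (dry run).

```python
import subprocess

# Directories named in the text above.
DIRS = ["/var/lib/influxdb/data", "/var/lib/influxdb/wal"]


def resync_plan(standby="influx02", service="influxdb"):
    """Return the commands in order: stop both, rsync data and wal, start standby, start master."""
    plan = [
        ["systemctl", "stop", service],                  # assumed unit name
        ["ssh", standby, "systemctl", "stop", service],
    ]
    for directory in DIRS:
        # Trailing slashes: copy the directory content, not the directory itself.
        plan.append(["rsync", "-a", "--delete", directory + "/",
                     "{}:{}/".format(standby, directory)])
    plan.append(["ssh", standby, "systemctl", "start", service])  # standby first
    plan.append(["systemctl", "start", service])                  # then the master
    return plan


def run(plan, dry_run=True):
    """Print every command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(resync_plan())
```

Starting the standby before the master mirrors the order used above, so influx02 is already listening when influx01 begins forwarding writes again.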
droidrider from France wrote on Dec 22nd, 2018:
Thank you very much for this article.
It works smoothly with InfluxDB 1.7.1.