InfluxDB backup on localhost fails with error: failed copy backup to file: err=<nil>


Published on March 14th 2019 - last updated on March 14th 2019 - Listed in DB Databases InfluxDB


I'm currently writing a script to automatically back up the databases of an InfluxDB instance. The first run of the backup script (which uses influxd backup in the background) succeeded, but when I ran the script a second time, and on every run after that, I stumbled across the following errors:

2019/03/14 09:48:56 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 2s and retrying (0)...
2019/03/14 09:48:58 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 2s and retrying (1)...
2019/03/14 09:49:00 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 2s and retrying (2)...
2019/03/14 09:49:02 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 2s and retrying (3)...
2019/03/14 09:49:04 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 2s and retrying (4)...
2019/03/14 09:49:06 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 2s and retrying (5)...
2019/03/14 09:49:08 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 3.01s and retrying (6)...
2019/03/14 09:49:11 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 11.441s and retrying (7)...
2019/03/14 09:49:23 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 43.477s and retrying (8)...
2019/03/14 09:50:06 Download shard 151 failed copy backup to file: err=<nil>, n=0.  Waiting 2m45.216s and retrying (9)...
2019/03/14 09:52:51 backup failed: copy backup to file: err=<nil>, n=0
backup: copy backup to file: err=<nil>, n=0

The syslog entries from the Influxd daemon revealed a little bit more:

Mar 14 09:49:34 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:34.924850Z lvl=info msg="Write failed" log_id=0BkKXKwW000 service=write shard=151 error="engine: error writing WAL entry: write /var/lib/influxdb/wal/icinga/autogen/151/_02519.wal: no space left on device"
Mar 14 09:49:35 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:35.703478Z lvl=info msg="Cache snapshot (start)" log_id=0BkKXKwW000 engine=tsm1 trace_id=0EAr7Rql000 op_name=tsm1_cache_snapshot op_event=start
Mar 14 09:49:35 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:35.703533Z lvl=info msg="Cache snapshot (end)" log_id=0BkKXKwW000 engine=tsm1 trace_id=0EAr7Rql000 op_name=tsm1_cache_snapshot op_event=end op_elapsed=0.068ms
Mar 14 09:49:35 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:35.703550Z lvl=info msg="Error writing snapshot" log_id=0BkKXKwW000 engine=tsm1 error="error opening new segment file for wal (1): write /var/lib/influxdb/wal/icinga/autogen/151/_02519.wal: no space left on device"
Mar 14 09:49:36 inf-monix02-p influxd[5153]: ts=2019-03-14T08:49:36.305281Z lvl=info msg="Write failed" log_id=0BkKXKwW000 service=write shard=151 error="engine: error writing WAL entry: write /var/lib/influxdb/wal/icinga/autogen/151/_02519.wal: no space left on device"
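
To get a quick overview of which shards are affected, log lines like the ones above can be summarized with standard tools. The following is only a rough sketch: the function name enospc_by_shard is made up for this example, and it assumes the log format shown above, where write errors carry a shard=<id> field. Lines without such a field (like the "Error writing snapshot" entry) are skipped.

```shell
#!/bin/sh
# Count "no space left on device" log lines per shard.
# Reads influxd log lines on stdin and prints one "<shard> <count>" pair per line.
# Lines that mention the error but carry no shard= field are ignored.
enospc_by_shard() {
    grep 'no space left on device' \
        | sed -n 's/.* shard=\([0-9][0-9]*\) .*/\1/p' \
        | sort -n | uniq -c \
        | awk '{ print $2, $1 }'
}

# Possible usage (the log location is an assumption and depends on the system):
# grep influxd /var/log/syslog | enospc_by_shard
```

For the excerpt above this would report shard 151 only, which matches the shard the backup kept retrying.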

The interesting thing, however, is that there was still enough disk space available:

$ df -h /var/lib/influxdb/
Filesystem               Type  Size  Used Avail Use% Mounted on
/dev/vglxc/inf-monix02-p ext4   50G   37G   13G  75% /
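
Since "no space left on device" can also be caused by running out of inodes, not just blocks, it may be worth checking both before a backup run. Here is a minimal sketch of such a pre-flight check. The function names and the 5 GiB threshold in the usage comment are assumptions made for this example; they are not part of the backup script. The -i option is not part of POSIX df, but GNU df supports it.

```shell
#!/bin/sh
# Available space in KiB on the filesystem that holds path "$1".
# -P forces the POSIX one-line-per-filesystem format, -k forces KiB units.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Free inodes on the filesystem that holds path "$1" (needs a df with -i, e.g. GNU df).
free_inodes() {
    df -Pi "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if the available amount "$1" is at least the required amount "$2".
enough_space() {
    [ "$1" -ge "$2" ]
}

# Possible usage; the 5 GiB threshold is an arbitrary assumption:
# enough_space "$(avail_kb /backup)" $((5 * 1024 * 1024)) || echo "low on disk space" >&2
# echo "free inodes: $(free_inodes /var/lib/influxdb)"
```

Note that such a check only shows what the filesystem reports at the moment it runs. It cannot tell what the InfluxDB process itself saw when the write failed.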

The initial size of that partition was 30GB and it was later increased to 50GB (this is an LXC container, so I was able to resize the root partition online). Maybe InfluxDB still had the original disk size in memory? Let's test this theory and restart InfluxDB:

# systemctl restart influxdb

And try the backup script again:

# ./backup-influxdb.sh
Clearing /backup
Thu Mar 14 09:57:32 CET 2019: Starting Dump of all databases
2019/03/14 09:57:32 backing up metastore to /backup/meta.00
2019/03/14 09:57:32 No database, retention policy or shard ID given. Full meta store backed up.
2019/03/14 09:57:32 Backing up all databases in portable format
2019/03/14 09:57:32 backing up db=
2019/03/14 09:57:32 backing up db=_internal rp=monitor shard=146 to /backup/_internal.monitor.00146.00 since 0001-01-01T00:00:00Z
[...]
2019/03/14 09:58:37 backing up db=icinga rp=autogen shard=133 to /backup/icinga.autogen.00133.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:41 backing up db=icinga rp=autogen shard=142 to /backup/icinga.autogen.00142.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:46 backing up db=icinga rp=autogen shard=151 to /backup/icinga.autogen.00151.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:48 backing up db=mtr rp=autogen shard=37 to /backup/mtr.autogen.00037.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:48 backing up db=mtr rp=autogen shard=44 to /backup/mtr.autogen.00044.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:48 backing up db=mtr rp=autogen shard=53 to /backup/mtr.autogen.00053.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=62 to /backup/mtr.autogen.00062.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=71 to /backup/mtr.autogen.00071.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=80 to /backup/mtr.autogen.00080.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=89 to /backup/mtr.autogen.00089.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=98 to /backup/mtr.autogen.00098.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=107 to /backup/mtr.autogen.00107.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=116 to /backup/mtr.autogen.00116.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:49 backing up db=mtr rp=autogen shard=125 to /backup/mtr.autogen.00125.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:50 backing up db=mtr rp=autogen shard=134 to /backup/mtr.autogen.00134.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:50 backing up db=mtr rp=autogen shard=143 to /backup/mtr.autogen.00143.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:50 backing up db=mtr rp=autogen shard=152 to /backup/mtr.autogen.00152.00 since 0001-01-01T00:00:00Z
2019/03/14 09:58:50 backup complete:
2019/03/14 09:58:50     /backup/20190314T085732Z.meta
2019/03/14 09:58:50     /backup/20190314T085732Z.s146.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s147.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s148.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s149.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s150.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s153.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s154.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s155.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s2.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s10.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s18.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s26.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s34.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s43.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s52.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s61.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s70.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s79.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s88.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s97.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s106.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s115.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s124.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s133.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s142.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s151.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s37.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s44.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s53.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s62.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s71.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s80.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s89.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s98.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s107.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s116.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s125.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s134.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s143.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.s152.tar.gz
2019/03/14 09:58:50     /backup/20190314T085732Z.manifest

real    1m18.625s
user    8m43.804s
sys    0m48.936s
Thu Mar 14 09:58:50 CET 2019: Finished Dump of all databases
Thu Mar 14 09:58:50 CET 2019: Backup script finished.

This time it worked!

At the end of the backup the following disk space was used:

$ df -h /var/lib/influxdb/
Filesystem               Type  Size  Used Avail Use% Mounted on
/dev/vglxc/inf-monix02-p ext4   50G   38G   13G  76% /

I ran the backup script a couple more times and the error mentioned at the beginning didn't show up anymore. So InfluxDB seems to determine and memorize the disk capacity when the daemon starts. Unfortunately I was not able to find proof for this theory, neither in the documentation nor using "show stats" or "show diagnostics".

The backup script can be found here: https://github.com/Napsty/scripts/blob/master/influxdb/backup-influxdb.sh

