Influxdb: client got "no space left on device" for 1.4.2

Created on 30 May 2018 · 16 Comments · Source: influxdata/influxdb

InfluxDB: 1.4.2
CentOS 7

When sending data to InfluxDB, I get messages like the one below. The error says there is no space left, but I checked the server and there is plenty of free space. Any ideas?

write influx error org.influxdb.InfluxDBException: engine: error writing WAL entry: write /var/lib/influxdb/wal/kafka/autogen/335/_00107.wal: no space left on device
at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:154)
at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:166)
at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:584)
at org.influxdb.impl.InfluxDBImpl.write(InfluxDBImpl.java:355)
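
One thing worth ruling out when the disk appears to have free space: on Linux, "no space left on device" (ENOSPC) is also returned when a filesystem has run out of inodes, not only when it has run out of blocks. The following is a minimal Python sketch, assuming a Unix host, that reports both values for the filesystem holding the WAL. The helper name `space_report` is hypothetical, and the path is taken from the error message above. This may well not be the cause here; it is only a quick check.

```python
import os


def space_report(path):
    """Return free bytes and inode counts for the filesystem containing path."""
    st = os.statvfs(path)  # Unix only
    return {
        # Space available to unprivileged processes, in bytes.
        "free_bytes": st.f_bavail * st.f_frsize,
        # Inodes available to unprivileged processes.
        "free_inodes": st.f_favail,
        # Total inodes (some filesystems report 0 here).
        "total_inodes": st.f_files,
    }


if __name__ == "__main__":
    # Path from the error message above; adjust to your installation.
    wal_dir = "/var/lib/influxdb/wal"
    target = wal_dir if os.path.exists(wal_dir) else "."
    print(target, space_report(target))
```

If `free_bytes` is large but `free_inodes` is 0 on a filesystem with a non-zero `total_inodes`, the filesystem is out of inodes rather than out of blocks.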

1.x needs-backpor1.7

Source

superheizai

All 16 comments

same issue on debian 8
400: {"error":"partial write: write /data/influxdb/data/_series/00/0000: no space left on device dropped=1"}

JamesGalt on 23 Aug 2018

👍3
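
A client can at least detect this case instead of dropping points silently. The following is a rough Python sketch that parses an error body like the one quoted above. The function name `parse_write_error` is hypothetical, and the assumption that the count of dropped points appears as `dropped=N` is based only on the responses quoted in this thread, not on a documented format.

```python
import json
import re


def parse_write_error(body):
    """Extract the message, a disk-full flag, and the dropped-point count."""
    message = json.loads(body).get("error", "")
    # Assumed format: the message ends with "dropped=<number>".
    match = re.search(r"dropped=(\d+)", message)
    return {
        "message": message,
        "disk_full": "no space left on device" in message,
        "dropped": int(match.group(1)) if match else None,
    }


body = (
    '{"error":"partial write: write /data/influxdb/data/_series/00/0000: '
    'no space left on device dropped=1"}'
)
info = parse_write_error(body)
if info["disk_full"]:
    print("server reports a full disk; points dropped:", info["dropped"])
```

A caller could use the `disk_full` flag to raise an alert or to keep the batch for a later retry rather than discarding it.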

Got the same on Debian 8.
Any workaround here?

utjc02 on 19 Sep 2018

Same here, on PC running Debian.
Disk was full, freed space, still getting errors like "error writing WAL entry: write /var/lib/influxdb/wal/kafka/autogen/335/_00107.wal: no space left on device"

A response would be nice.

godidog on 26 Sep 2018

I got the same problem:
Exception in thread "Thread-11" org.influxdb.InfluxDBException: engine: error writing WAL entry: write /home/influxdb/wal/s2/second/409/_00001.wal: file already closed
at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:154)
at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:166)
at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:609)
at org.influxdb.impl.InfluxDBImpl.write(InfluxDBImpl.java:369)
at org.influxdb.impl.InfluxDBImpl.write(InfluxDBImpl.java:382)
at com.cnegroup.power.influxdb.InfluxDBConnection.batchInsert(InfluxDBConnection.java:201)
at com.cnegroup.power.receiver.InfluxReceiverRunnable.saveRecordValue(InfluxReceiverRunnable.java:201)
at com.cnegroup.power.receiver.InfluxReceiverRunnable.run(InfluxReceiverRunnable.java:79)
at java.lang.Thread.run(Thread.java:748)
How can this be solved?
I found that the disk where InfluxDB is installed was 100% full. After I expanded the disk, it began to work again.

541211190 on 22 Oct 2018

👍2

Same here. I have plenty of space on my disk, yet InfluxDB insists that there is no space left on the device.

yifeikong on 30 Oct 2018

Restarting InfluxDB seems to solve the problem.

yifeikong on 30 Oct 2018

The problem is that a restart empties the DB cache, so all points in the cache are lost.

JamesGalt on 30 Oct 2018

same issue on debian 8
400: {"error":"partial write: write /data/influxdb/data/_series/00/0000: no space left on device dropped=1"}
I have the same problem.
POST result: {"error":"partial write: write /data/device_statistics/_series/00/0000: no space left on device dropped=495"}

Could you please tell me how to solve it?

licunzhi on 28 Dec 2018

@JamesGalt Have you solved this problem?
I would like to know how to solve it.

licunzhi on 28 Dec 2018

I'm experiencing the same on InfluxDB 1.6.1 on Debian 8.
Some clients send metrics with no issues, while another client gets the disk-full response.
I can see no relevant log entries on the InfluxDB side.

kali-brandwatch on 2 May 2019

For me as well. Debian 9.9 with influxdb 1.7.6.

Do you need more information?

bmg-iis on 9 May 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] on 7 Aug 2019

This issue has been automatically closed because it has not had recent activity. Please reopen if this issue is still important to you. Thank you for your contributions.

stale[bot] on 14 Aug 2019

I'm seeing this issue with v1.7.7. The problem is that once a disk error is encountered, the WAL's buffered writer cannot recover, even after the disk is working again.
The silent failure is a problem; if the error cannot be handled, it would be better to escalate it.

foobar on 23 Sep 2019

👍1

I saw this behaviour on CentOS with influxdb-1.7.9-1.x86_64: the disk fills up -> WAL write error. After cleaning up some disk space, InfluxDB does not recover. Not only does it fail to recover, which is a bit annoying, but worse, it constantly logs the failure to syslog, thereby rapidly filling up the disk again...

Restart of the service "fixed" the problem.

hlovdal on 26 Dec 2019

👍1

Same here on Debian GNU/Linux 9.13 (stretch) with InfluxDB 1.8.0. I just lost a week's worth of data because I hadn't noticed that, for a short period, the partition where InfluxDB stores its data had run out of space. The partition recovered almost instantly as plenty of temporary space was freed. However, InfluxDB kept complaining about no space and did not store any data persistently, yet it responded to read queries properly and did not return errors to the writing clients either.

I think this sort of behavior is fundamentally broken. A database should have persistence as a primary goal. If data cannot be made persistent, writers should see exceptions by default. Also, e-mail alerts would have come in handy because I'm not permanently monitoring my syslog.
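
Until the server surfaces this itself, a small log check can stand in for watching syslog by hand. The following is a minimal Python sketch that counts log lines mentioning the disk-full error. The function names `count_disk_full_errors` and `should_alert` are hypothetical, and the sample lines are invented for illustration; only the matched substring comes from the errors quoted in this thread. Real log formats and locations vary by distribution.

```python
# Substring taken from the errors quoted in this thread.
NEEDLE = "no space left on device"


def count_disk_full_errors(lines, needle=NEEDLE):
    """Count log lines that mention the disk-full error."""
    return sum(1 for line in lines if needle in line)


def should_alert(lines, threshold=1):
    """Return True when at least `threshold` matching lines are seen."""
    return count_disk_full_errors(lines) >= threshold


# Invented sample lines; the real syslog format will differ.
sample = [
    "influxd: engine: error writing WAL entry: write ...: no space left on device",
    "influxd: some unrelated message",
    "influxd: engine: error writing WAL entry: write ...: no space left on device",
]

if should_alert(sample):
    print("disk-full errors seen:", count_disk_full_errors(sample))
```

In practice the `lines` argument could be an open log file, and the alert could send an e-mail or trigger a service restart, depending on what the operator prefers.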