Thanos: compact: failure on empty block

Created on 26 Feb 2019 · 4 comments · Source: thanos-io/thanos

Thanos, Prometheus and Golang version used

improbable/thanos:v0.3.1
quay.io/prometheus/prometheus:v2.7.1

What happened

Thanos compact is crash looping.

What you expected to happen

Compaction completes successfully.

How to reproduce it (as minimally and precisely as possible):

I'm not sure how the state of blocks in S3 was produced, but I think I understand what is happening.

Given a set of empty blocks to compact, tsdb's LeveledCompactor produces no output block: it marks all of the input blocks deletable, writes nothing to disk, and returns the zero ulid.ULID with a nil error.

// Compactor provides compaction against an underlying storage
// of time series data.
type Compactor interface {
    //...

    // Compact runs compaction against the provided directories. Must
    // only be called concurrently with results of Plan().
    // Can optionally pass a list of already open blocks,
    // to avoid having to reopen them.
    // When resulting Block has 0 samples
    //  * No block is written.
    //  * The source dirs are marked Deletable.
    //  * Returns empty ulid.ULID{}.
    Compact(dest string, dirs []string, open []*Block) (ulid.ULID, error)
}

Thanos does not handle this case: it exits when the directory it expects to find under the (zero) ULID does not exist.

error executing compaction: compaction failed: compaction: failed to finalize the block /data/compact/0@{prometheus="monitoring/default",prometheus_replica="prometheus-default-0"}/00000000000000000000000000: read new meta: open /data/compact/0@{prometheus="monitoring/default",prometheus_replica="prometheus-default-0"}/00000000000000000000000000/meta.json: no such file or directory

Full logs to relevant components

Logs

...
{"block":"01D4KE5VJX8MHRXKNXRT4QB1C2","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.582029611Z"}
{"block":"01D4KG1PJXBWFJ1SA7NP72TV8R","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.618553643Z"}
{"block":"01D4KG20QN7S9K0VMQPMFS7V2K","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.640079676Z"}
{"block":"01D4KN1JNSKEQR143DWGNHAKCY","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.676418268Z"}
{"block":"01D4KN1JNSKEQR143DWGNHAKCY","caller":"compact.go:194","level":"debug","msg":"block is too fresh for now","ts":"2019-02-26T01:21:38.694373934Z"}
{"block":"01D4KN1JTJFZH897Q1KB3187SC","caller":"compact.go:176","level":"debug","msg":"download meta","ts":"2019-02-26T01:21:38.694406388Z"}
{"block":"01D4KN1JTJFZH897Q1KB3187SC","caller":"compact.go:194","level":"debug","msg":"block is too fresh for now","ts":"2019-02-26T01:21:38.789104732Z"}
{"caller":"compact.go:827","level":"info","msg":"start of GC","ts":"2019-02-26T01:21:38.789165195Z"}
{"blocks":"[/data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QR3YED74WDJZ8ZV5YW1P /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QSWTA6EH0SN6R9V36270 /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QTRA6B16JPXSBV9SVNKK /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QVJ3P7QRYN1J3V89DFHA]","caller":"compact.go:721","compactionGroup":"0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}","duration":"580.824228ms","level":"debug","msg":"downloaded and verified blocks","ts":"2019-02-26T01:21:39.465482905Z"}
{"caller":"compact.go:384","count":4,"duration":"31.131698ms","level":"info","msg":"compact blocks resulted in empty block","sources":"[01D3M2QR3YED74WDJZ8ZV5YW1P 01D3M2QSWTA6EH0SN6R9V36270 01D3M2QTRA6B16JPXSBV9SVNKK 01D3M2QVJ3P7QRYN1J3V89DFHA]","ts":"2019-02-26T01:21:39.496691527Z"}
{"blocks":"[/data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QR3YED74WDJZ8ZV5YW1P /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QSWTA6EH0SN6R9V36270 /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QTRA6B16JPXSBV9SVNKK /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/01D3M2QVJ3P7QRYN1J3V89DFHA]","caller":"compact.go:730","compactionGroup":"0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}","duration":"31.254751ms","level":"debug","msg":"compacted blocks","ts":"2019-02-26T01:21:39.496812357Z"}
{"caller":"main.go:181","err":"error executing compaction: compaction failed: compaction: failed to finalize the block /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/00000000000000000000000000: read new meta: open /data/compact/0@{prometheus=\"monitoring/default\",prometheus_replica=\"prometheus-default-0\"}/00000000000000000000000000/meta.json: no such file or directory","level":"error","msg":"running command failed","ts":"2019-02-26T01:21:39.496912774Z"}

All 4 comments

I've looked up the blocks from the log in S3 and none of them have any chunks, just small (~400B) meta.json and (~150KB) index files.

I purged all the empty blocks in my S3 bucket (after backing them up in case they would be useful) and now compact is happy again.

Yeah, we had to create a script to remove all empty blocks from the bucket; it worked after that. Then another issue popped up: out-of-order label set ^^

Closing as this should be fixed by #904. Please shout if I am wrong :)
