level=info ts=2019-04-22T12:40:23.651280397Z caller=downsample.go:239 msg="downsampled block" from=01D9284J4BDB9SV9KV74ADMP32 to=01D92GS3GDTH2Q4WRJ46T22HMA duration=3m33.154079955s
level=error ts=2019-04-22T12:40:35.549001957Z caller=main.go:182 msg="running command failed" err="downsampling failed: downsampling to 5 min: upload downsampled block 01D92GS3GDTH2Q4WRJ46T22HMA: read meta: open data/01D92GS3GDTH2Q4WRJ46T22HMA/meta.json: no such file or directory"
sh-4.2$ cd data
sh-4.2$ ls -laR
.:
total 0
drwxr-xr-x. 4 1000620000 root 74 Apr 22 15:36 .
drwxrwxrwt. 1 root root 18 Apr 22 15:36 ..
drwxr-xr-x. 3 1000620000 root 74 Apr 22 15:36 01D9284J4BDB9SV9KV74ADMP32
drwxr-xr-x. 3 1000620000 root 33 Apr 22 15:36 01D92GS3GDTH2Q4WRJ46T22HMA
./01D9284J4BDB9SV9KV74ADMP32:
total 1484752
drwxr-xr-x. 3 1000620000 root 74 Apr 22 15:36 .
drwxr-xr-x. 4 1000620000 root 74 Apr 22 15:36 ..
drwxr-xr-x. 2 1000620000 root 34 Apr 22 15:36 chunks
-rw-r--r--. 1 1000620000 root 1408701585 Apr 22 15:36 index
-rw-r--r--. 1 1000620000 root 111669428 Apr 22 15:36 index.cache.json
-rw-r--r--. 1 1000620000 root 4140 Apr 22 15:36 meta.json
./01D9284J4BDB9SV9KV74ADMP32/chunks:
total 590160
drwxr-xr-x. 2 1000620000 root 34 Apr 22 15:36 .
drwxr-xr-x. 3 1000620000 root 74 Apr 22 15:36 ..
-rw-r--r--. 1 1000620000 root 536866757 Apr 22 15:36 000001
-rw-r--r--. 1 1000620000 root 67455797 Apr 22 15:36 000002
./01D92GS3GDTH2Q4WRJ46T22HMA:
total 1351092
drwxr-xr-x. 3 1000620000 root 33 Apr 22 15:36 .
drwxr-xr-x. 4 1000620000 root 74 Apr 22 15:36 ..
drwxr-xr-x. 2 1000620000 root 34 Apr 22 15:38 chunks
-rw-r--r--. 1 1000620000 root 1383518160 Apr 22 15:40 index
./01D92GS3GDTH2Q4WRJ46T22HMA/chunks:
total 858888
drwxr-xr-x. 2 1000620000 root 34 Apr 22 15:38 .
drwxr-xr-x. 3 1000620000 root 33 Apr 22 15:36 ..
-rw-r--r--. 1 1000620000 root 536870901 Apr 22 15:38 000001
-rw-r--r--. 1 1000620000 root 342628705 Apr 22 15:40 000002
I have the same issue: after downsampling, the upload fails while reading meta.json, because that file doesn't exist.
Log output just FYI:
level=info ts=2019-04-20T03:53:19.870896186Z caller=compact.go:244 msg="start first pass of downsampling"
level=info ts=2019-04-20T03:53:40.010226063Z caller=downsample.go:212 msg="downloaded block" id=01D8WCV13QC3KEWC8Y8EPTETJK duration=20.091231512s
level=info ts=2019-04-20T03:54:59.682436397Z caller=downsample.go:239 msg="downsampled block" from=01D8WCV13QC3KEWC8Y8EPTETJK to=01D8WE1P4W7CT11ZJ4ZVX9G2WR duration=1m18.935265238s
level=error ts=2019-04-20T03:55:00.054814444Z caller=main.go:182 msg="running command failed" err="error executing compaction: first pass of downsampling failed: downsampling to 5 min: upload downsampled block 01D8WE1P4W7CT11ZJ4ZVX9G2WR: read meta: open /var/thanos/compact/downsample/01D8WE1P4W7CT11ZJ4ZVX9G2WR/meta.json: no such file or directory"
Hi, Thanks for the report.
What Thanos version?
Is this error repeatable?
@bwplotka I have added an additional check to the downsampling test that verifies the meta.json file exists. The latest RC/master has this issue.
In general, it looks like this line is not executed: https://github.com/improbable-eng/thanos/blob/master/pkg/compact/downsample/streamed_block_writer.go#L202
Version 0.3.2 successfully downsampled and uploaded data, so it's a master/RC regression.
It's safe to roll back to compactor 0.3.2 AFAIK. I will investigate more; nothing obvious so far.
The only change that relates to this part (indirectly) is https://github.com/improbable-eng/thanos/pull/986 cc @xjewer , will dig more after a meeting.
It's reproducible on our setups as well. We only found it today, after the holiday break, because our compactor alerts fire only during office hours. (:
The actual error was masked by a bug in the err pointer handling of CloseWithErrCapture. Fixing all of that now. The underlying error (at least locally) is: unexpected error: close closers: write index cache: write index cache: open mmap index file /tmp/downsample-raw264203102/01D950XM8F32AFTHMZ6DJHTZM9/index: mmap: invalid argument
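To illustrate how an error captured in a deferred close can get lost, here is a minimal sketch of one common way this happens in Go. It is not the Thanos code and may not match the exact bug; `captureClose`, `failingCloser`, `masked` and `surfaced` are hypothetical names, and the error text is shortened from the message above.

```go
package main

import (
	"errors"
	"fmt"
)

// failingCloser is a hypothetical closer whose Close always fails.
type failingCloser struct{}

func (failingCloser) Close() error {
	return errors.New("write index cache: mmap: invalid argument")
}

// captureClose closes c and stores the close error in *err,
// but only if *err does not already hold an error.
func captureClose(err *error, c interface{ Close() error }) {
	if cerr := c.Close(); cerr != nil && *err == nil {
		*err = cerr
	}
}

// masked loses the close error: the deferred call writes into a local
// variable, which is not the value the caller receives.
func masked() error {
	var err error
	defer captureClose(&err, failingCloser{})
	return nil
}

// surfaced uses a named return value, so the deferred call can still
// change what the caller sees after the return statement has run.
func surfaced() (err error) {
	defer captureClose(&err, failingCloser{})
	return nil
}

func main() {
	fmt.Println("masked:", masked())     // masked: <nil>
	fmt.Println("surfaced:", surfaced()) // surfaced: write index cache: mmap: invalid argument
}
```

When the close error is swallowed like this, the caller carries on as if the block had been written correctly, and the first visible failure is a later step, such as reading a meta.json that was never created.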
https://github.com/improbable-eng/thanos/pull/986 introduced the bug, particularly this: https://github.com/xjewer/thanos/blob/5c32ed52adb10aa2c158616a74b4936ccc308f70/pkg/compact/downsample/streamed_block_writer.go#L198
Looking into why mmap is failing. Probably no index file is available at this path.
Index size is 0...
Fix: https://github.com/improbable-eng/thanos/pull/1070
Please help in reviewing
I see the same issue with the latest Thanos version, v0.4.0-rc.0.
A build from master works.
Yes, rc.0 has it. We are releasing rc.1 today with all the fixes.
rc.1 helps