Thanos: receive: high cpu when upgrading from 0.12.2 with old data

Created on 23 Jun 2020 · 18 comments · Source: thanos-io/thanos

Thanos, Prometheus and Golang version used:
thanos, version 0.13.0 (branch: HEAD, revision: adf6facb8d6bf44097aae084ec091ac3febd9eb8)
build user: circleci@b8cd18f8b553
build date: 20200622-10:04:50
go version: go1.14.2

Object Storage Provider:
GCP

What happened:
We're using 3 thanos-receive v0.12.2 pods running with --receive.replication-factor=3.
At 3:30 I restarted one pod (green line) with v0.13.0, and its CPU usage doubled:
[screenshot: container CPU usage graph; the green line doubles after the upgrade]
Memory usage is 5-10% higher, which is fine.
Here is another graph, from the node where the pod has been running:
[screenshot: node-level CPU usage graph]

What you expected to happen:
A negligible resource usage change between v0.12.2 and v0.13.0 for receive, as with other Thanos components.

How to reproduce it (as minimally and precisely as possible):
We run receive with these args:

          args:
          - receive
          - |
            --objstore.config=type: GCS
            config:
              bucket: jb-thanos
          - --tsdb.path=/data
          - --tsdb.retention=1d
          - --tsdb.min-block-duration=30m  # must be >2x the prometheus-side 15m; see https://github.com/thanos-io/thanos/issues/2114
          - --tsdb.max-block-duration=30m
          - --label=replica="$(NAME)"
          - --receive.local-endpoint=$(NAME).thanos-receive.monitoring.svc.cluster.local:10901
          - --receive.hashrings-file=/cfg/hashrings.json
          - --receive.replication-factor=3
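
For reference, the file passed via --receive.hashrings-file for a 3-replica StatefulSet like this one would look roughly as follows. This is an illustrative sketch: the endpoint names are inferred from the --receive.local-endpoint template above and the thanos-receive-N pod names seen later in this issue, not copied from the actual /cfg/hashrings.json:

    [
      {
        "hashring": "default",
        "endpoints": [
          "thanos-receive-0.thanos-receive.monitoring.svc.cluster.local:10901",
          "thanos-receive-1.thanos-receive.monitoring.svc.cluster.local:10901",
          "thanos-receive-2.thanos-receive.monitoring.svc.cluster.local:10901"
        ]
      }
    ]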

Full logs to relevant components:
These are new events that were not present in v0.12.2. They are also written every ~15s, compared to ~15min in Prometheus v2.19.0 with almost the same TSDB settings:

Logs

level=info ts=2020-06-23T00:37:15.498358444Z caller=head.go:662 component=receive tenant=default-tenant component=tsdb msg="Head GC completed" duration=308.339725ms
level=info ts=2020-06-23T00:37:22.528425096Z caller=head.go:662 component=receive tenant=default-tenant component=tsdb msg="Head GC completed" duration=287.226708ms
level=info ts=2020-06-23T00:37:25.321693858Z caller=head.go:734 component=receive tenant=default-tenant component=tsdb msg="WAL checkpoint complete" first=51 last=52 duration=2.793211656s
level=info ts=2020-06-23T00:37:34.117011112Z caller=head.go:662 component=receive tenant=default-tenant component=tsdb msg="Head GC completed" duration=305.063766ms
level=info ts=2020-06-23T00:37:42.249223736Z caller=head.go:662 component=receive tenant=default-tenant component=tsdb msg="Head GC completed" duration=353.091759ms
level=info ts=2020-06-23T00:37:45.317240183Z caller=head.go:734 component=receive tenant=default-tenant component=tsdb msg="WAL checkpoint complete" first=53 last=54 duration=3.067957396s

Anything else we need to know:

Labels: bug, help wanted, stale

Most helpful comment

We can repro this on our side, looking into this more (:

All 18 comments

Could you take a CPU profile of this?

Please find 30s CPU profiles attached; they were taken at almost the same time:
Archive.zip

Thanks, looks like GC is the current bottleneck:

DIFF 0.12.2 vs 0.13.0:
profile001.pdf

We can repro this on our side, looking into this more (:

JFYI,
I've backported https://github.com/thanos-io/thanos/pull/2621 and https://github.com/thanos-io/thanos/pull/2679 to v0.12.2, and it does not lead to a CPU usage increase.

It's not specific to v0.13.0, but this code certainly doesn't help: https://github.com/prometheus/prometheus/blob/b788986717e1597452ca25e5219510bb787165c7/tsdb/db.go#L812

The TSDB we use forces a head GC multiple times. That might make sense for a single TSDB, but not really for multiple; a minimal sketch of the effect follows.
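
The sketch below is illustrative Go, not the actual Thanos or Prometheus source: the assumption it encodes is that the multiTSDB layer opens one TSDB per directory it finds, and each open TSDB runs its own head GC on every compaction cycle, so N leftover directories mean N GC passes per cycle instead of one. The tenant names are the block-dir ULIDs from the metrics quoted later in this issue:

    package main

    import (
    	"fmt"
    	"time"
    )

    // tenantDB stands in for a per-tenant *tsdb.DB instance.
    type tenantDB struct{ name string }

    // compactHead mimics the per-TSDB compaction step that ends in a head GC.
    func (db *tenantDB) compactHead() {
    	start := time.Now()
    	// ... truncate head, rebuild postings, write WAL checkpoint ...
    	fmt.Printf("tenant=%s msg=%q duration=%s\n", db.name, "Head GC completed", time.Since(start))
    }

    func main() {
    	// Each leftover v0.12.2 block directory is opened as its own "tenant",
    	// so GC work scales with the number of directories, not with the data.
    	tenants := []*tenantDB{
    		{name: "01EC1KGAFNHYWZ6Z8CPMZ8TQQ6"},
    		{name: "01EC1HSC4SJ9KSZRYENXX9PS18"},
    		{name: "01EC1G2E84CNBJ3NFY7P4PE5RW"},
    	}
    	for _, db := range tenants {
    		db.compactHead()
    	}
    }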

they are written every ~15s, compared to ~15min in Prometheus v2.19.0 with almost the same TSDB settings:

Turns out this high CPU usage happens when I start thanos-receive v0.13.0 on a non-empty TSDB folder from v0.12.2 (which has many subfolders like 01EC0YX4APXB1MGA39DWFTS96C, etc.).
This leads to each subfolder being converted to a separate TSDB:

...
prometheus_tsdb_head_samples_appended_total{app="thanos-receive",...,statefulset_kubernetes_io_pod_name="thanos-receive-1",tenant="01EC1KGAFNHYWZ6Z8CPMZ8TQQ6"}
prometheus_tsdb_head_samples_appended_total{app="thanos-receive",...,statefulset_kubernetes_io_pod_name="thanos-receive-1",tenant="01EC1HSC4SJ9KSZRYENXX9PS18"}
prometheus_tsdb_head_samples_appended_total{app="thanos-receive",...,statefulset_kubernetes_io_pod_name="thanos-receive-1",tenant="01EC1G2E84CNBJ3NFY7P4PE5RW"}
prometheus_tsdb_head_samples_appended_total{app="thanos-receive",...,statefulset_kubernetes_io_pod_name="thanos-receive-1",tenant="01EC1EBGTBP4HRW2WGG9KEXN7P"}
prometheus_tsdb_head_samples_appended_total{app="thanos-receive",...,statefulset_kubernetes_io_pod_name="thanos-receive-1",tenant="01EC1CMKR2521YKPS1283WGBKJ"}
prometheus_tsdb_head_samples_appended_total{app="thanos-receive",...,statefulset_kubernetes_io_pod_name="thanos-receive-1",tenant="01EC1AXMY36Z98AQ64QQRD10MP"}
prometheus_tsdb_head_samples_appended_total{app="thanos-receive",...,statefulset_kubernetes_io_pod_name="thanos-receive-1",tenant="01EC196PXDRFDA39FW476KR7SJ"}
prometheus_tsdb_head_samples_appended_total{app="thanos-receive",...,statefulset_kubernetes_io_pod_name="thanos-receive-1",tenant="01EC17FSYJY8AWTYXCP3QW9J9W"}
...

And then the log is full of component=tsdb msg="Head GC completed" events.
If I wipe the TSDB folder before starting v0.13.0, it consumes the same CPU as v0.12.2.
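
A quick way to tell which layout a data dir is in (the paths and ULIDs below are illustrative, taken from this issue; the tenant name matches the default-tenant seen in the logs above):

    # v0.12.2 single-TSDB layout: block ULID dirs live directly under --tsdb.path
    $ ls /data
    01EC0YX4APXB1MGA39DWFTS96C  01EC17FSYJY8AWTYXCP3QW9J9W  wal

    # v0.13.0 multiTSDB layout: one subdirectory per tenant, blocks nested inside
    $ ls /data
    default-tenant
    $ ls /data/default-tenant
    01EC1CMKR2521YKPS1283WGBKJ  wal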

This looks like something we experienced as well, but now that I recall it, we did indeed wipe the dir.

It's a migration issue from non-multiTSDB to multiTSDB.

This can easily be reproduced in a unit test by porting old receive data. Let's fix it; it might be annoying for Receive users who want to upgrade to 0.13+.

I think for now you can run with the higher CPU long enough for the WAL from 0.12 to be gone, then wipe those dirs, e.g. as sketched below. In the meantime we will look into a fix. Still help wanted (:
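
Concretely, that wipe could look something like this sketch. It assumes the old blocks have already been uploaded to object storage and the 0.12 WAL has been fully flushed; the paths and glob are illustrative, so double-check before deleting anything:

    $ cd /data
    # remove leftover top-level ULID block dirs from the v0.12.2 layout,
    # so they are no longer opened as separate tenant TSDBs on restart
    $ rm -rf 01E*
    # remove the old single-TSDB WAL as well
    $ rm -rf wal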

That looks like the migration didn't work correctly; the dirs look like block directories. Just to make sure though, you are not running a multi-tenant setup, right?

Correct, we're not running multi-tenancy, and these are just block dirs.

Hello 👋 Looks like there was no activity on this issue for the last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

Closing for now as promised, let us know if you need this to be reopened! 🤗

Still the migration path has to be fixed in theory.

Hello 👋 Looks like there was no activity on this issue for the last 30 days.
Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next week, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the remind command if you wish to be reminded at some point in the future.

Closing for now as promised, let us know if you need this to be reopened! 🤗
