Acceptance criteria:
--> https://github.com/improbable-eng/thanos/issues/942
Initial ideas:
Extra option mentioned below: add --max-time and --min-time to store & compactor to "shard" those within time.
CC @claytono @tdabasinskas @xjewer
Another option @antonio and I have discussed is adding --mintime and --maxtime flags to thanos store and compactor. If the flags were given, each component would ignore blocks outside the given time range, allowing you to run multiple thanos store and compactor components against a single bucket, and also to easily repartition just by selecting different time ranges.
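A minimal sketch of the proposed behavior (the names and structure here are hypothetical, not the actual Thanos implementation): each store or compactor instance would simply skip any block whose time span falls entirely outside its configured window.

```python
from typing import List, NamedTuple

class Block(NamedTuple):
    """A bucket block with its time span (Unix millis, as in a block's meta)."""
    id: str
    min_time: int
    max_time: int

def select_blocks(blocks: List[Block], min_time: int, max_time: int) -> List[Block]:
    """Keep only blocks overlapping [min_time, max_time], mimicking what a
    --min-time/--max-time pair would do for a store or compactor instance."""
    return [b for b in blocks if b.max_time >= min_time and b.min_time <= max_time]

blocks = [
    Block("old", 0, 100),
    Block("mid", 100, 200),
    Block("new", 200, 300),
]

# A "recent data" store instance would serve only the newer blocks:
print([b.id for b in select_blocks(blocks, 150, 300)])  # ['mid', 'new']
```

A second instance configured with the complementary window (e.g. 0–150) would pick up the older blocks, which is what makes repartitioning a pure configuration change.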
I think we should add this https://github.com/improbable-eng/thanos/issues/335 to the list.
Is there any update?
I've started work on a patch for the --min-time and --max-time functionality. I've got it working for the store code, and I hope to start on the compactor piece soon.
Help wanted for other stuff.
We also likely fixed: https://github.com/improbable-eng/thanos/issues/335 on master, but tests are pending by @GiedriusS (:
@claytono cool :+1:
@claytono cool, can you submit the store code first? That's what we need...
I have a lot of large buckets, many of the index.cache.json files are ~100MB.
One idea that came to mind was to use FlatBuffers.
@claytono is there any update? It would be nice to solve this in a general way as we discussed here.
I'm hoping to get a PR up for this within the week if time allows. For now, my PR only addresses partitioning on the thanos-store side. It's not clear to me whether similar limiting is really needed on the compactor side. We're planning to do an initial deployment without compactor support for time ranges.
> Another option @antonio and I have discussed is adding a --mintime and --maxtime
We discussed this as well in @povilasv's PR: https://github.com/improbable-eng/thanos/pull/930
I just tried 0.3.2 on Tuesday, it didn't work for my large buckets in s3. I have 37 prometheus clusters (currently), 9TB of data total, largest bucket is around 700GB. I reverted to 0.2.1 and things are back to normal. High latency and query timeouts were the issues I was seeing. I am running prometheus 2.4.3, not sure if that might have been contributing to the issue.
Do you guys think this work will help towards that end? Thanks for the great work 😄
@midnightconman have you read the change log? Most likely you need to increase your index cache size (:
I did 😄
I tried settings of --index-cache-size=20GB and --chunk-pool-size=200GB, no change. Strangely, the disk usage in /data is the same for 0.2.1 and 0.3.2?
This isn't just a matter of slower queries (e.g. 200ms on 0.2.1 vs. 1000ms on 0.3.2); on 0.3.2, queries against the larger buckets never return.
Could we have multiple store gateways divide the load between themselves? Ideally I picture three store gateways pointing at a single bucket, each handling a third of the chunks, divided over the whole time period (e.g. all have some newer and some older chunks). If another gateway were added, they would work out a new way to divide the blocks, and likewise if one disappeared. I think this would be nicer than having the user work out the time ranges to match the chunks, and it would also prevent the gateway with the newest chunks doing most of the work while the ones with older chunks do little.
@baelish That seems ideal. The manual time range partitioning was mostly proposed as something fairly simple to implement and start using quickly. I'd guess the issues with the automatic approach would be coordination between the gateways and the need to publish consistent time ranges. On the latter: stores currently publish just a min time and a max time, so if you want queries routed only to a store that definitely has the blocks, you'd need to ensure each store holds a contiguous range of blocks, or change the way ranges are published so a store can advertise multiple time ranges.
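One way to make the split automatic rather than manual is to hash each block ID modulo the number of gateways. This is only a sketch of @baelish's idea, not how Thanos implements anything; note that it gives each gateway an interleaved, non-contiguous set of blocks, which is exactly the advertising problem described above.

```python
import hashlib

def owner(block_id: str, num_gateways: int) -> int:
    """Deterministically assign a block to one of num_gateways store
    gateways by hashing its ID (hashmod-style sharding)."""
    digest = hashlib.sha256(block_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_gateways

blocks = [f"block-{i:02d}" for i in range(6)]
for gw in range(3):
    assigned = [b for b in blocks if owner(b, 3) == gw]
    print(f"gateway {gw}: {assigned}")

# Adding a fourth gateway changes owner(b, 4), so some blocks move between
# gateways -- that reshuffling, plus each gateway having to advertise a
# non-contiguous set of time ranges, is the coordination cost noted above.
```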
@claytono makes sense, sometimes you need to get things out there quick. Perhaps it could be considered a long term goal.
Thanks, everyone involved! :heart:
We now have time partitioning and sharding of blocks by external labels, as requested in this ticket, so we can close this!
For further improvements and ideas tracking issue please see: https://github.com/thanos-io/thanos/issues/1705
Happy Halloween!