Influxdb: Log intermediate full compactions as level 4 compactions

Created on 1 Mar 2018 · 5 Comments · Source: influxdata/influxdb

There is some confusion in the logs related to full compactions. Full compactions run as needed and show up in the logs like (pre 1.5):

[I] 2018-02-28T22:36:57Z beginning full compaction of group 0, 2 TSM files engine=tsm1
[I] 2018-02-28T22:36:57Z compacting full group (0) /Users/jason/.influxdb/data/_internal/monitor/1/000000002-000000002.tsm (#0) engine=tsm1
[I] 2018-02-28T22:36:57Z compacting full group (0) /Users/jason/.influxdb/data/_internal/monitor/1/000000003-000000001.tsm (#1) engine=tsm1
[I] 2018-02-28T22:36:57Z compacted full group (0) into /Users/jason/.influxdb/data/_internal/monitor/1/000000003-000000002.tsm.tmp (#0) engine=tsm1
[I] 2018-02-28T22:36:57Z compacted full 2 files into 1 files in 8.552039ms engine=tsm1

There is also a full compaction that runs after the compact-full-write-cold-duration threshold has elapsed. The main difference between the full compaction above and this final one is the scope of files included in the compaction. The prior one is smaller and limited to 4 files at a time. The final one covers the full shard.

Since these are both logged as full compactions, it can be confusing as some users expect there to be only one full compaction that occurs after the compact-full-write-cold-duration.

To remedy this, we should switch the logging for the intermediate full compactions to log as level 4 compactions.
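
The proposed relabeling can be pictured with a minimal Go sketch. It is only an illustration of the idea, not InfluxDB code: the type compactionScope, its constants, the function logLabel, and the printed message text are all hypothetical names invented for this example.

```go
package main

import "fmt"

// compactionScope is a hypothetical type for this sketch; it is not an
// InfluxDB identifier.
type compactionScope int

const (
	scopeGroup compactionScope = iota // intermediate: a small group of TSM files
	scopeShard                        // final: every TSM file in the shard
)

// logLabel returns the name a compaction would be logged under if the
// proposal were adopted: only the shard-wide compaction keeps the "full" name.
func logLabel(scope compactionScope) string {
	if scope == scopeShard {
		return "full"
	}
	return "level 4"
}

func main() {
	// The message format here is illustrative, not the real log format.
	fmt.Printf("beginning %s compaction\n", logLabel(scopeGroup))
	fmt.Printf("beginning %s compaction\n", logLabel(scopeShard))
}
```

With a split like this, a user searching the logs for "full" would match only the compaction that follows compact-full-write-cold-duration.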

1.x preview proposed wontfix

All 5 comments

@ono760 shared with me a customer's confusion about what the compact-full-write-cold-duration setting does versus what they read in its description (config file/docs). They suggested clarifying the docs. After talking to @jwilder, it was clear that his proposed change would make this confusion much less likely. @stuartcarnie also agreed with me that this will make it easier for customers to monitor full compactions properly.

If compact-full-write-cold-duration has elapsed and the shard has been fully compacted, and new writes then arrive for this shard, what type of compaction is performed? Is it a full compaction or a level 4 compaction? This was the confusion referred to by @stevebang.

@breckcs In that case, the shard would have the existing fully compacted data and one or more newer files in various levels. Because new writes have arrived, the shard would become hot again and the compaction planner would run and determine which compactions to run (if any). As more data is written, the shard would accumulate more level 1, 2, and 3 TSM files, which would be compacted as needed. Once compact-full-write-cold-duration has passed since the last change to the shard, the shard would be considered cold again and would be fully compacted again.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This issue has been automatically closed because it has not had recent activity. Please reopen if this issue is still important to you. Thank you for your contributions.
