Presto: Optimizer exhausted the time limit

Created on 17 Nov 2017  路  18Comments  路  Source: prestodb/presto

In Presto 0.188 with iterative optimizer, we are observing "The optimizer exhausted the time limit" error for certain types of queries, which usually contain many operations (more than 2000 nodes) and table references.

We've also tried to increase experimental.iterative-optimizer-timeout from 3m to 10m, but it was not sufficient. Here is a copy of the stack trace:

Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: com.facebook.presto.spi.PrestoException: The optimizer exhausted the time limit of 180000 ms
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreNode(IterativeOptimizer.java:118)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:93)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreChildren(IterativeOptimizer.java:182)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.exploreGroup(IterativeOptimizer.java:95)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.iterative.IterativeOptimizer.optimize(IterativeOptimizer.java:84)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:133)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.sql.planner.LogicalPlanner.plan(LogicalPlanner.java:122)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.execution.SqlQueryExecution.doAnalyzeQuery(SqlQueryExecution.java:366)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.execution.SqlQueryExecution.analyzeQuery(SqlQueryExecution.java:345)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.facebook.presto.execution.SqlQueryExecution.start(SqlQueryExecution.java:277)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at com.treasure_data.presto.server.TDQueryExecution.start(TDQueryExecution.java:118)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
Nov 17 01:38:56 ip-172-18-132-23 presto[37566]: #011at java.lang.Thread.run(Thread.java:745)

We cannot expose this specific query (because this is our customer's query), but we are now trying to create an example that causes this long optimization time. By disabling iterative-optimizer, the query runs normally.

Most helpful comment

I confirmed that @kokosing #9705 has solved the issue!

All 18 comments

Can you please attach the profiller? The most likely IterativeOptimizer stuck in a cycle, so profiller log should point out which rules participate in cycle.

Just for test reason, can you please increase experimental.iterative-optimizer-timeout to something like an hour, to see if IterativeOptimizer ends eventually. That would mean there is no cycle, but there is something to optimize in optimizer.

@kokosing Yes. Some queries actually started by setting a longer timeout. So it's not a cycle issue. Our guess is excessive use of copyOf causes planning performance issue for large queries, or if a table has thousands of columns, even a simple select * from (table) query can cause this issue.

Yes, I have heard of this issue of excessive usage of copyOf.
I will try to reproduce this.

I played a bit with this and I did not found any easy solution, maybe it is because I had simple test cases. I saw couple of places which could be optimized, but it did not help much. Queries are planed within a seconds, it is not fast, but it is not a blocker.

@xerial Can you please play with https://github.com/prestodb/presto/pull/9376 in order to find a query which reproduces your issue?

FYI @Praveen2112

@xerial Could you post a query that causes this timeout in order to make it easier to reproduce?

Thanks for the PR. We also need to find a simpler query to reproduce this issue, but let me try.

@xerial, what do the JMX RuleStats counters say before and after the query runs? I.e., which rules contribute the most to execution time and which fire most often?

@martint I haven't checked that counter. Let me check it.

PickTableLayoutForPredicate looks dominant player. I suspect very large Hits or Time.Count

Name    Value   Type    Display Name    Update Interval Description
Failures    0   long    Failures    -1  
Hits    1098242 long    Hits    -1  
Time.Count  1098242.0   double  Time.Count  -1  
Time.Max    6291.455    double  Time.Max    -1  
Time.MaxError   0.004584599751238798    double  Time.MaxError   -1  
Time.Min    100.352 double  Time.Min    -1  
Time.P50    122.367 double  Time.P50    -1  
Time.P75    142.847 double  Time.P75    -1  
Time.P90    196.607 double  Time.P90    -1  
Time.P95    230.143 double  Time.P95    -1  
Time.P99    327.679 double  Time.P99    -1  
Time.Percentiles    java.util.LinkedHashMap, size 100   java.util.Map   Time.Percentiles    -1  
Time.Unit   MICROSECONDS    java.util.concurrent.TimeUnit   Time.Unit   -1  

Attaching EXPLAIN result with set session iterative_optimizer_enabled = false
explain.txt

@miniway are you diagnosing the same problem as reported by @xerial or just a similar one?
PickTableLayout (the class containing PickTableLayoutForPredicate) is accused of being infinite-loop-prone (https://github.com/prestodb/presto/pull/8735#issuecomment-355852977)

Looking at the current code of PickTableLayoutForPredicate there was a chance that rule could loop forever. Here is a fix: https://github.com/prestodb/presto/pull/9705

So it has to be different issue as @xerial said that increasing timeout helped.

Anyway @xerial and @miniway can you please guys somehow check if #9705 fixes your issue?

It鈥檚 the same problem that @xerial reported. I鈥檓 going to test the fix. Thanks

I confirmed that @kokosing #9705 has solved the issue!

https://github.com/prestodb/presto/pull/9705 is merged. Let me close this issue.

Thanks, I thought it was going to be closed automatically as I used magic work Fixes in the PR comment.

Was this page helpful?
0 / 5 - 0 ratings