Presto: DynamicFiltering throws prestoException due to Invalid partition value 'N'

Created on 17 Nov 2020  路  5Comments  路  Source: prestodb/presto

When using Hive connector to fetch data that has null value in partitioning key, query fails as follows

com.facebook.presto.spi.PrestoException: Invalid partition value '\N' for BIGINT partition key: ss_sold_date_sk at com.facebook.presto.hive.HiveUtil.bigintPartitionKey(HiveUtil.java:710) at com.facebook.presto.hive.HiveUtil.parsePartitionValue(HiveUtil.java:596) at com.facebook.presto.hive.HivePageSourceProvider.shouldSkipPartition(HivePageSourceProvider.java:547) at com.facebook.presto.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:172) at com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:63) at com.facebook.presto.split.PageSourceManager.createPageSource(PageSourceManager.java:80) at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:231) at com.facebook.presto.operator.Driver.processInternal(Driver.java:417) at com.facebook.presto.operator.Driver.lambda$processFor$8(Driver.java:300) at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:721) at com.facebook.presto.operator.Driver.processFor(Driver.java:293) at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1077) at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:162) at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:545) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

bug

All 5 comments

there's glitch in handling partition pruning logic when hive partitioning key contains a null value. I can try to fix it

cc: @kewang1024

there's glitch in handling partition pruning logic when hive partitioning key contains a null value. I can try to fix it

Thanks for finding the issue, yeah, it would be great if you can help work on it, you can add me as the reviewer when you have the fix, thanks!

Can you put the failed example query here and tables definition if they're not tpch tables

Bug is fixed. Recreate step added below:

DDL:
CREATE TABLE t3(c2 bigint, c1 bigint)
CREATE TABLE t4(c2 bigint, c1 bigint) with(partitioned_by=array['c1'])
INSERT INTO t3 VALUES(null, 2)", 1)
INSERT INTO t4 VALUES(null, null), (2,2)", 2)

Failing query:
select * from t3, t4 where t3.c1=t4.c2

Was this page helpful?
0 / 5 - 0 ratings