Presto: Cannot query table with partition created by TEZ

Created on 23 Mar 2018  ·  13Comments  ·  Source: prestodb/presto

Hi, we found that if use hive + tez , data file will be created under the sub-directory of a partition directory, and presto cannot read the data under the sub-directory while hive can, which will caused data missing during query.

That is , presto can get data correctly with partition like this:

"/user/hive/warehouse/test.db/table1/bucketid=2017-11-20/00000_0",

__select count(*) from xxx where bucketid='2017-11-20'__ returns 100

but cannot work with partition like this

"/user/hive/warehouse/test.db/table1/bucketid=2017-11-20/1/00000_0".

__select count(*) from xxx where bucketid='2017-11-20'__ returns 0

And hive works fine with both struct of partition directory.
Is this a bug or there's a configuration enable the feature?

Thanks a lot!

Most helpful comment

One of the situations, when Hive+Tez will create several subfolders is when UNION is used.
(As I understand) It goes through specific optimization and will write a folder per query in UNION, if UNION(s) is the last operations before writing into the table.

For example, the following query may create at least 2 subfolders under 'year=2018/month=3' partition:

 INSERT OVERWRITE TABLE test_table
 PARTITION(year = 2018, month = 3)
 SELECT *
 FROM
 (
     SELECT * FROM a
     UNION ALL
     SELECT * FROM b
 ) out;

This is completely separate from ACID tables, and any kind of shuffle operation can be used as a workaround (e.g. 'ORDER BY ' ). Though it could be a quite expensive workaround.

Here are some additional information and examples:

p.s. If @linwukang is ok with it, I can try to come up with some structured information for the documentation.

All 13 comments

It looks like this is using ORC ACID, which Presto does not support. There should be an option to disable that in Tez.

@electrum Thanks for your reply, I will check it soon.

Hi, @electrum. As we has checked the orc acid feature, we believe that it is not caused by ORC ACID.

With hive
we use 'insert overwrite', which is not allowed on tables with ORC ACID feature on, to import data to that table. As we're not familiar with TEZ, we guess that in some situation other than ORC ACID, TEZ will store the data in the sub-directory under the bucket root.
So, should we support the data directory struct in presto for query?

After reading the source code of presto, I finally found that it is a feature already exists in the presto. By adding hive.recursive-directories=true to the catalog/hive.properties, presto read the data under the sub-directory.
I didn't find any information of this feature in the official document,:-(

goole the keyword "hive.recursive-directories", I find this, (-.-):
Investigate whether hive.recursive-directories should be enabled by default #7873

@linwukang good find! maybe you'd like to create a PR adding the missing doc?

My memory of Hive ACID tables, is that the tables are bucketed, and the inserts and deletes are placed in sub directories. Each directory is named for a transaction and you have to check with the metastore to know which ones are committed. Also, the deletes are a special format where they reference row IDs that should be skipped in other files. Basically, this means you need special logic to process these tables and hive.recursive-directories=true will likely result in incorrect results or failures.

One of the situations, when Hive+Tez will create several subfolders is when UNION is used.
(As I understand) It goes through specific optimization and will write a folder per query in UNION, if UNION(s) is the last operations before writing into the table.

For example, the following query may create at least 2 subfolders under 'year=2018/month=3' partition:

 INSERT OVERWRITE TABLE test_table
 PARTITION(year = 2018, month = 3)
 SELECT *
 FROM
 (
     SELECT * FROM a
     UNION ALL
     SELECT * FROM b
 ) out;

This is completely separate from ACID tables, and any kind of shuffle operation can be used as a workaround (e.g. 'ORDER BY ' ). Though it could be a quite expensive workaround.

Here are some additional information and examples:

p.s. If @linwukang is ok with it, I can try to come up with some structured information for the documentation.

@rchukh, actually that's the situation we're on. We do use UNION in the sub-query statement. The workaround by adding a shuffle operation is really expensive for us. As we do not have any Hive ACID tables, enabling hive.recursive-directories may be a simpler and more efficient solution for us :-).

I was preparing a PR which adding a simple information about the configuration 'hive.recursive-directories' in the connector/hive section, as @findepi suggested. It is a good idea if you can add more detail information about this issue to the documents 👍 .

@dain, it is true that enabling hive.recursive-directories will cause incorrect results on query with Hive ACID tables. I think that's why hive.recursive-directories is set to false as default. May be we can make the configuration as a session property, so users like us can have more selectivity.

@dain @linwukang would it make sense to detect whether table is Hive ACID table and if it isn't, turn on hive.recursive-directories automatically? Would it be at all possible?

@findepi It's a good idea! I will do a deeper investigation on this issue.

@dain @linwukang would it make sense to detect whether table is Hive ACID table and if it isn't, turn on hive.recursive-directories automatically? Would it be at all possible?

Or raise an error when hive.recursive-directories is turned on and table is ACID like.

It sounds like we should automatically disable recursive-directories for ACID tables (e.g. table with property "transactional"="true"). Also, if normal tables in the latest hive expect us to process subdirectories we should consider changing the recursive-directories default to true.

Was this page helpful?
0 / 5 - 0 ratings