Presto: Query Errors out on Hive Bucket table

Created on 20 Jul 2017 · 22 Comments · Source: prestodb/presto

We have a table with Hive partitions and bucketing on the second column. Queries work fine in Hive, but when we try to access the table in Presto, they fail with a "Hive table is corrupt" error. Here is a sample of what we are trying to do:

create table TEST_HIVE_BUCKETS (
COL1 string,
COL2 string
)
PARTITIONED BY (
  eff_dt timestamp)
  CLUSTERED BY( COL2) INTO 50 BUCKETS;

Load data for at least 2-3 days, then run the query:

presto:default> select * from test_hive_buckets;
Query 20170720_145352_00039_m57j6 failed: Hive table is corrupt. It is declared as being bucketed, but the files do not match the bucketing declaration. The number of files in the directory (50) does not match the declared bucket count (5) for partition: eff_dt=2016-12-02 00%3A00%3A00

It looks like the number of files under the folder must exactly match the number of buckets in the table definition. Is this the case?
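For reference, the check Presto appears to be applying here can be modeled roughly as follows (a simplified sketch, not Presto's actual Hive-connector code; the function name is hypothetical):

```python
import os

def validate_bucketed_partition(partition_dir: str, declared_buckets: int) -> None:
    """Simplified model of the validation: the number of data files in a
    partition directory must equal the table's declared bucket count."""
    # Skip hidden and temporary files (e.g. _SUCCESS markers, .staging dirs)
    files = [f for f in os.listdir(partition_dir)
             if not f.startswith(('.', '_'))]
    if len(files) != declared_buckets:
        raise ValueError(
            f"Hive table is corrupt. The number of files in the directory "
            f"({len(files)}) does not match the declared bucket count "
            f"({declared_buckets})")
```

With 50 files in the partition directory but 5 declared buckets (or vice versa), this raises exactly the kind of error shown above.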


All 22 comments

This means the table has a declared bucket count of 5, not 50, which is strange given the above table definition. Can you show the output of DESCRIBE FORMATTED test_hive_buckets as run in Hive?

This is the output -

hive> DESCRIBE FORMATTED test_hive_buckets;
OK
# col_name              data_type               comment

col1                    string
col2                    string

# Partition Information
# col_name              data_type               comment

eff_dt                  timestamp

# Detailed Table Information
Database:               default
Owner:                  root
CreateTime:             Thu Jul 20 14:49:16 UTC 2017
LastAccessTime:         UNKNOWN
Retention:              0
Location:               hdfs://ip-<>.com:8020/user/hive/warehouse/test_hive_buckets
Table Type:             MANAGED_TABLE
Table Parameters:
        transient_lastDdlTime   1500562156

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No
Num Buckets:            50
Bucket Columns:         [col2]
Sort Columns:           []
Storage Desc Params:
        serialization.format    1
Time taken: 0.86 seconds, Fetched: 31 row(s)

Hive is forgiving your mistake and allowing the query to run, while Presto is not. You declared your table to have 50 buckets, but you ended up with only 5. This can happen when users manually delete files without updating the Hive metastore, when data is loaded directly into the physical location backing the Hive table, or after multiple inserts into a bucketed table.

The Teradata folks tried to fix such conditions in the past, I believe.
These are the properties they added to make Presto work:

hive.multi-file-bucketing.enabled=true
hive.empty-bucketed-partitions.enabled=true

You can find more here - https://github.com/prestodb/presto/pull/6282

Thank you. The files were created by Hive without any manual intervention. This can happen (and I think it's a valid scenario) when Hive is not able to split the input data into that many buckets. For example, we might get just one record for a day (in data-warehouse terms), so there will be only one bucket/file for that day. Do the properties you mentioned exist today? I get errors when I try to use them:

2017-07-26T13:09:55.663Z        WARN    main    Bootstrap       UNUSED PROPERTIES
2017-07-26T13:09:55.663Z        WARN    main    Bootstrap       hive.empty-bucketed-partitions.enabled=true
2017-07-26T13:09:55.663Z        WARN    main    Bootstrap       hive.multi-file-bucketing.enabled=true
2017-07-26T13:09:55.663Z        WARN    main    Bootstrap
2017-07-26T13:09:55.838Z        ERROR   main    com.facebook.presto.server.PrestoServer Unable to create injector, see the following errors:

1) Configuration property 'hive.empty-bucketed-partitions.enabled=true' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)

2) Configuration property 'hive.multi-file-bucketing.enabled=true' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)

2 errors
com.google.inject.CreationException: Unable to create injector, see the following errors:

1) Configuration property 'hive.empty-bucketed-partitions.enabled=true' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)

2) Configuration property 'hive.multi-file-bucketing.enabled=true' was not used
  at io.airlift.bootstrap.Bootstrap.lambda$initialize$2(Bootstrap.java:235)

2 errors
        at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:466)
        at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:155)
        at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:107)
        at com.google.inject.Guice.createInjector(Guice.java:96)
        at io.airlift.bootstrap.Bootstrap.initialize(Bootstrap.java:242)
        at com.facebook.presto.hive.HiveConnectorFactory.create(HiveConnectorFactory.java:107)
        at com.facebook.presto.connector.ConnectorManager.createConnector(ConnectorManager.java:304)
        at com.facebook.presto.connector.ConnectorManager.addCatalogConnector(ConnectorManager.java:193)
        at com.facebook.presto.connector.ConnectorManager.createConnection(ConnectorManager.java:185)

You will have to use Teradata's version of Presto for these additional
properties to work.
http://www.teradata.com/products-and-services/Presto/Presto-Download


Switching to Teradata's distribution does not look like a good option for us. First, it seems to track older versions of PrestoDB at the moment (version 167t, while we use 0.181). Second, we do not want to fix this one issue at the cost of other issues going unfixed or new features becoming unavailable.

Can Prestodb team help us?

Something strange is going on. Can you try to reproduce it with a new table and give us the exact Hive commands to do so? Also, what version of Hive and Hadoop?

I tried with a new table.

presto:default> select * from TEST_HIVE_BUCKETS_AGAIN;
Query 20170811_180022_00003_iykxv failed: Hive table is corrupt. It is declared as being bucketed, but the files do not match the bucketing declaration. The number of files in the directory (5) does not match the declared bucket count (10) for partition: eff_dt=2017-02-01 08%3A15%3A59

This is the Table script and data insert script:

create table TEST_HIVE_BUCKETS_AGAIN (
COL1_NEW string,
COL2_NEW string
)
PARTITIONED BY (
  eff_dt timestamp)
  CLUSTERED BY( COL2_NEW) INTO 10 BUCKETS;

INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-1', 'TEST_COL1', from_unixtime(unix_timestamp('20170101081559999', 'yyyyMMddHHmmssSSS'));
INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-2', 'TEST_COL2', from_unixtime(unix_timestamp('20170101081559999', 'yyyyMMddHHmmssSSS'));
INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-3', 'TEST_COL1', from_unixtime(unix_timestamp('20170101081559999', 'yyyyMMddHHmmssSSS'));
INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-4', 'TEST_COL2', from_unixtime(unix_timestamp('20170101081559999', 'yyyyMMddHHmmssSSS'));
INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-5', 'TEST_COL1', from_unixtime(unix_timestamp('20170101081559999', 'yyyyMMddHHmmssSSS'));
INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-1', 'TEST_COL1', from_unixtime(unix_timestamp('20170201081559999', 'yyyyMMddHHmmssSSS'));
INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-2', 'TEST_COL2', from_unixtime(unix_timestamp('20170201081559999', 'yyyyMMddHHmmssSSS'));
INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-3', 'TEST_COL1', from_unixtime(unix_timestamp('20170201081559999', 'yyyyMMddHHmmssSSS'));
INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-4', 'TEST_COL2', from_unixtime(unix_timestamp('20170201081559999', 'yyyyMMddHHmmssSSS'));
INSERT INTO TABLE TEST_HIVE_BUCKETS_AGAIN PARTITION (eff_dt)  Select 'TEST_COL1-5', 'TEST_COL2', from_unixtime(unix_timestamp('20170201081559999', 'yyyyMMddHHmmssSSS'));

The Describe Table -

hive> DESCRIBE FORMATTED test_hive_buckets_again;
OK
# col_name              data_type               comment

col1_new                string
col2_new                string

# Partition Information
# col_name              data_type               comment

eff_dt                  timestamp

# Detailed Table Information
Database:               default
Owner:                  root
CreateTime:             Fri Aug 11 17:53:14 UTC 2017
LastAccessTime:         UNKNOWN
Retention:              0
Location:               hdfs://<IP_COMMENTED_OUT>:8020/user/hive/warehouse/test_hive_buckets_again
Table Type:             MANAGED_TABLE
Table Parameters:
        transient_lastDdlTime   1502473994

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:            org.apache.hadoop.mapred.TextInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:             No
Num Buckets:            10
Bucket Columns:         [col2_new]
Sort Columns:           []
Storage Desc Params:
        serialization.format    1
Time taken: 0.875 seconds, Fetched: 31 row(s)

I am using an Amazon EMR cluster to test:

$: hive --version
Hive 2.1.1-amzn-0
Subversion git://ip-10-111-226-159/workspace/workspace/foscraig.bigtop.release-rpm-5.6.0/build/hive/rpm/BUILD/apache-hive-2.1.1-amzn-0-src -r 154f1ef53e2d6ed126b0957d7995e0a610947608
Compiled by ec2-user on Thu Jun 1 05:57:14 UTC 2017
From source with checksum 292cb49f57dc3b8a5d71afc7f343da08
$: hadoop version
Hadoop 2.7.3-amzn-2
Subversion [email protected]:/pkg/Aws157BigTop -r 154f1ef53e2d6ed126b0957d7995e0a610947608
Compiled by ec2-user on 2017-06-01T05:34Z
Compiled with protoc 2.5.0
From source with checksum 1833aada17b94cfb94ad40ccd02d3df8
This command was run using /usr/lib/hadoop/hadoop-common-2.7.3-amzn-2.jar

Presto Version: 0.182

There are multiple cases where the number of files will differ from the number of buckets:

  1. Recent versions of Hive will not create a file for an empty bucket when running in Tez mode, which is the default. So if you have fewer records than buckets, or your data distribution is skewed, there will be fewer files than buckets.
    https://issues.apache.org/jira/browse/HIVE-13040?focusedCommentId=15159223&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15159223

  2. "Insert into" queries create additional files on each execution, so each bucket can consist of multiple files (sharing the same prefix).

All in all, the requirement that the number of files equal the number of buckets is too strict. I think Presto should at least fall back to non-bucketed execution if it can't yet handle a variable number of files per bucket.
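To make point 1 concrete, here is a rough Python model of how Hive assigns rows of a string clustering column to buckets (Java `String.hashCode` style hash modulo the bucket count — a simplified sketch, not Hive's actual code). With only two distinct values of the clustering column, at most two of the declared buckets can ever receive data, so Tez writes at most two files:

```python
def java_string_hashcode(s: str) -> int:
    """Mimic Java's String.hashCode: h = 31*h + ch, wrapped to signed 32-bit."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

def bucket_for(value: str, num_buckets: int) -> int:
    """Bucket number for a string key: (hash & Integer.MAX_VALUE) % buckets."""
    return (java_string_hashcode(value) & 0x7FFFFFFF) % num_buckets
```

For example, rows whose clustering column is only ever 'TEST_COL1' or 'TEST_COL2' can populate at most two of ten declared buckets, leaving the rest empty.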

We encountered a similar issue when upgrading from EMR-4.3.0/Hive-1/Presto-0.139 to EMR-5.7.0/Hive-2/Presto-0.170.
We are trying to reprocess all partitions of bucketed tables where the number of files does not match the declared bucket count. For cases where the actual number of rows is smaller than the number of buckets, it looks like we have to force Hive to create the empty files. For multiple inserts into the same partition, we will have to consolidate the files and ensure all future inserts into bucketed tables are insert overwrite (acceptable in our use case).
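One workaround sketch for the "force Hive to create the empty files" case (a hypothetical helper, not something Hive or Presto provides): pad each partition directory with zero-byte files for the missing bucket numbers so the file count matches the declared bucket count. This is only valid for text-format tables; ORC buckets would need properly written empty ORC files instead.

```python
import os

def pad_missing_buckets(partition_dir: str, num_buckets: int) -> list:
    """Create zero-byte placeholder files for bucket numbers that have no
    data file, assuming Hive's NNNNNN_0 naming. Text-format tables only."""
    present = set()
    for name in os.listdir(partition_dir):
        if name[:6].isdigit():
            present.add(int(name[:6]))
    created = []
    for b in range(num_buckets):
        if b not in present:
            path = os.path.join(partition_dir, f"{b:06d}_0")
            open(path, "w").close()  # zero-byte file = empty text bucket
            created.append(path)
    return created
```

After padding, the partition would pass the file-count check, at the cost of cluttering HDFS with empty files.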

Obviously the two Teradata properties mentioned above, i.e.
hive.multi-file-bucketing.enabled=true
hive.empty-bucketed-partitions.enabled=true
would really help; in fact we used them when we first looked into upgrading Presto.

Is there a reason these were not brought over?

@dikei I think number 1 above is why it is failing. Hive will be running with a Tez backend on EMR therefore it won't create the empty files.

@yanyani989 The reason you hit this when upgrading from emr-4.3.0 -> emr-5.7.0 is due to the switch in the hive execution engine from MR -> Tez as explained above.

I think this is something that Presto should be able to work with considering Hive is okay with not creating empty files when Tez execution mode is enabled (they are doing this to optimize).

@z-york Yes, we did more testing on reloading these bucketed partitions; here is what we observed:
EMR4/Hive1

  • insert overwrite from/to the same partition doesn't work, as it wipes the partition first
  • a bucketed-to-bucketed copy doesn't create the empty files (it seems to create the same number of files as the source partition), so a temporary non-bucketed table must be used
  • the empty files are created when inserting from a non-bucketed table

EMR5/Hive2

  • uses Tez by default, which skips creating empty buckets, causing the problem with EMR 5.7 Presto
  • insert overwrite from/to the same partition works, as it creates staging temp files/folders (makes reloads slightly easier)
  • does not create empty files under Tez, regardless of whether the insert is from a bucketed or a non-bucketed temp table
  • when using MR it will create the empty files, but Hive 2 warns that MR may be dropped in a future release

I have also encountered this problem.
This is my architecture:
flume --> hive (hdfs) --> presto
Hive create table:
create table test (bytes_in int,bytes_out int,device_id string,device_type string,host string,latency int,level string,method string,msg string,path string,referer string,remote_ip string,response_code int,route string,status int,type string,uri string,user_agent string,user_id bigint,time string)
PARTITIONED BY(year string,month string,day string)
clustered by (user_id) into 5 buckets
stored as orc
TBLPROPERTIES ("transactional"="true")
Presto select error:
select count(*) from test;
Query 20171023_063934_00016_bqi4f failed: Hive table is corrupt. It is declared as being bucketed, but the files do not match the bucketing declaration. Found sub-directory in bucket directory for partition: year=2017/month=10/day=23

I also encountered this problem. Hive running in Tez mode does not create empty files. For example:

I have 32 buckets in the DDL, and an insert produces the data files below:

/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000001_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000002_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000003_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000004_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000006_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000008_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000009_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000010_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000012_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000013_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000014_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000015_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000016_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000017_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000018_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000021_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000022_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000023_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000024_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000025_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000026_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000027_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000028_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000029_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000030_0
/user/hive/warehouse/myds.db/demo_data/day=2017-11-14/000031_0

I think it is a compatibility bug.
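For what it's worth, here is a quick way to list which bucket numbers are missing from such a directory listing, assuming Hive's NNNNNN_0 file-naming convention (hypothetical diagnostic helper):

```python
import re

def missing_buckets(filenames, num_buckets):
    """Return bucket numbers with no data file, given Hive-style names
    like 000004_0 (six-digit bucket number, underscore, copy number)."""
    present = {int(m.group(1))
               for name in filenames
               if (m := re.match(r"(\d{6})_\d+$", name))}
    return sorted(set(range(num_buckets)) - present)
```

Run against the 26 files listed above with num_buckets=32, this would report buckets 0, 5, 7, 11, 19, and 20 as missing.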

In case it helps, the latest version from Starburst has the code fix for
the above scenario.
Check out the release notes for more details -
https://docs.starburstdata.com/latest/release/release-0.195-e.html

On Mon, Apr 2, 2018 at 8:11 AM, rabinnh notifications@github.com wrote:

I've been working on this issue for a week, and have posted and received
some answers in both the Hive and Presto mailing-list/forums. Here's what I
have found out:

  • AFAIK, there is no way to force Hive with the Tez engine (and mr is
    deprecated) to create all the buckets. So no help from Hive.
  • There is no way to force Presto to continue when not all the buckets
    are present (bucket number does not match metastore). So no help from
    Presto.

So we would seem to be at an impasse. If any partition in the table
doesn't have a full complement of buckets, a Presto query will fail. I can
insert with the mr engine, but it is much slower and there is no way to
tell if it will continue to be supported, since it is deprecated. We run
into this in our test environment, but we cannot confidently deploy to
production because we can't chance that any particular hour will result in
fewer buckets in a partition. This makes Presto too fragile for our use
when using buckets.

The only "solution" that I can think of is to do away with bucketing
completely and to depend solely on ORC indexes, which seems like a pretty
sub-standard solution.

My guess is that Facebook would almost never run into the issue, because
there's likely so much data that all buckets are always created, even with
Tez.



We use AWS EMR, so it doesn't help us directly, but it is interesting: now we know it can be done.

If you do the insert with Presto instead of Hive/Tez all the files are created. The only caveat is that when you create the table you can't use SORTED BY, as Presto cannot query clustered, sorted tables.

@rabinnh To clarify, Presto can query bucket sorted tables, but cannot insert into them.

Fair enough. But as the crux of my issue is inserting into a bucketed table such that Presto will always be able to query it, doing away with the SORTED BY part of the DDL is a fair tradeoff.

Thanks for the clarification though; someone who finds this thread will now have accurate information.

@rabinnh Agreed. Thanks for raising that point here.

For everyone else, there is some additional discussion on #10301. We are looking at ways to improve this so we can be compatible when possible.

With EMR 5-21 (Presto 0.215), still got this issue: Query 20191107_222140_00006_rf89j failed: Hive table 'dev.wifi_logs' is corrupt. The number of files in the directory (256) does not match the declared bucket count (64) for partition: date_key=2019-11-05

Are there any configurations to ignore this check?

Did you try Starburst's Presto in the marketplace too?
The company employs the bulk of the former Teradata Presto engineers, so their distribution may support those parameters.
