Presto: Cannot access Hive view when Hive catalog name is not "hive"

Created on 2 Nov 2020  Â·  13Comments  Â·  Source: prestosql/presto

Starting from version 345 can't get hive views from Presto when catalog name not hive:
Caused by: java.sql.SQLException: Query failed (#20201102_203650_09667_pba5b): line 1:15: Failed analyzing stored view 'myhive.test.view_hive': line 2:6: Catalog 'hive' does not exist

It works in version 344

bug

All 13 comments

@lemdandk I suppose you renamed catalog hive → myhive, right?

@ebyhr No, For me it always was myhive(I have in catalogs myhive.propertiesas connector to metastore). View is stored in a database like:

 TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME |   OWNER    | OWNER_TYPE | RETENTION | SD_ID  | TBL_NAME |   TBL_TYPE   |          VIEW_EXPANDED_TEXT          |          VIEW_ORIGINAL_TEXT          | IS_REWRITE_ENABLED
--------+-------------+-------+------------------+------------+------------+-----------+--------+----------+--------------+--------------------------------------+--------------------------------------+--------------------
   34 |  1604361631 |    63 |                0 | lemdandk | USER       |         0 | 136834 | view_hms | VIRTUAL_VIEW | select * from test.test_data | select * from test.test_data | f
(1 row)

But for some reasons starting from 345 presto start to add hive as catalog. Output of SHOW CREATE VIEW myhive.test.view_hive:

CREATE VIEW myhive.test.view_hive AS
SELECT
  "event_time"
, "string_value"
, "part_date"
FROM
  "hive"."test"."test_data"

But for instance on 344 it works as expected it doesn't add anything

@lemdandk Thanks for your info. I could reproduce the issue and it looks bug.

RelToPrestoConverter#convert prepends hive catalog.
https://github.com/prestosql/presto/blob/09f1ad2fe1cb5f06e91308271af998666fad3692/presto-hive/src/main/java/io/prestosql/plugin/hive/ViewReaderUtil.java#L137-L138

cc: @laurachenyu

Do we know where this constant hive comes from?

Looks like it's actually HiveToRel, not RelToPresto. If you look at rel.getInput(0).getTable().getQualifiedName() there's already the "hive" prefix.

The constant comes from com.linkedin.coral.hive.hive2rel.HiveSchema#ROOT_SCHEMA

The catalog needs to be set inside of com.linkedin.coral.hive.hive2rel.parsetree.ParseTreeBuilder.Config. It currently defaults to HiveSchema.ROOT_SCHEMA. We need to set it to the Hive connector catalog. Unfortunately, this seems to require a new release of Coral, since this is private inside of HiveToRelConverter.

@phd3 has created a ticket for Coral (https://github.com/linkedin/coral/issues/18)

It's a horrible hack but we could do something like this in the mean time:

prestoSql = prestoSql.replace(format("\"hive\".\"%s\"", table.getDatabaseName()), format("\"%s\".\"%s\"", catalogName, table.getDatabaseName()));

This only works if a view only accesses tables within the same schema, and it has other problems.

@electrum Presto seems to use the catalog for the view as the default catalog while analyzing the view. https://github.com/prestosql/presto/blob/master/presto-main/src/main/java/io/prestosql/sql/analyzer/StatementAnalyzer.java#L2574

Do we have a preference in terms of whether coral should prepend a catalog or not? It seems to me that coral shouldn't need to know about the catalogs, since a hive view doesn't have that concept.

Presto seems to use the catalog for the view as the default catalog while analyzing the view. https://github.com/prestosql/presto/blob/master/presto-main/src/main/java/io/prestosql/sql/analyzer/StatementAnalyzer.java#L2574

correct. VIEW must refer to same table(s), regardless of current user session's catalog/schema.

Do we have a preference in terms of whether coral should prepend a catalog or not? It seems to me that coral shouldn't need to know about the catalogs, since a hive view doesn't have that concept.

it doesn't have to. Is it easy to make coral _not_ to prepend them?

@phd3 not appending it seems like a good option. We would then set the Hive connector catalog in the returned ConnectorViewDefinition, so that the SQL would be resolved with that catalog as the session catalog.

ConnectorViewDefinition contains a catalog/schema that is used as the session catalog/schema when resolving identifiers in the view. We did this originally to avoid needing to rewrite view definitions to contain fully qualified names.

Was this page helpful?
0 / 5 - 0 ratings