Incubator-superset: hive/impala can't show table schema

Created on 12 Aug 2019 · 5 Comments · Source: apache/incubator-superset

I connect to a Hive database with a URL of the form impala://host:port/. The connection works, and I can execute SQL in SQL Lab. However, 'See table schema' does not behave correctly: it lists the database name instead of the tables. See the screenshots for details.

Screenshots

'See table schema' should list the tables, but it actually shows the database name:
image
Clicking the item produces this error:
image

Console output:

impala.hiveserver2:GetOperationStatus: resp=TGetOperationStatusResp(status=TStatus(statusCode=0, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), operationState=5, sqlState=None, errorCode=0, errorMessage="org.apache.spark.sql.AnalysisException: Table or view not found: scannetwork.scannetwork; line 1 pos 14;\n'GlobalLimit 0\n+- 'LocalLimit 0\n +- 'Project [*]\n +- 'UnresolvedRelation scannetwork.scannetwork\n")

When I execute 'show tables', it returns the correct result:

image

Environment:

Superset 0.28.1
node.js v10.15.0
npm 6.4.1
impyla 0.15.0+2.ge4f2146
SQLAlchemy 1.2.7

#bug

All 5 comments

Issue-Label Bot is automatically applying the label #bug to this issue, with a confidence of 0.77. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

I guess 'See table schema' just takes the first column of each row as the table name. Here, however, the first column turns out to be the database name.
What can I do to fix it?

Hmm, according to the docs SHOW TABLES should only return the table name in the first column, not what you're seeing. Perhaps the fact that you're mixing Hive with Impala might be the cause?

https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_show.html#show_tables

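Well, it turns out to be a Spark SQL matter: its SHOW TABLES output puts the database name in the first column and the table name in the second.
I finally solved it by editing site-packages/impala/sqlalchemy.py, changing it
from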

def get_table_names(self, connection, schema=None, **kw):
    query = 'SHOW TABLES'
    if schema is not None:
        query += ' IN %s' % schema
    return [tup[0] for tup in connection.execute(query).fetchall()]

to

    def get_table_names(self, connection, schema=None, **kw):
        query = 'SHOW TABLES'
        if schema is not None:
            query += ' IN %s' % schema
        return [t[1] if len(t) > 1 else t[0] for t in connection.execute(query).fetchall()]
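
To make the effect of that one-line change concrete, here is a minimal standalone sketch of the row handling. The helper name `extract_table_names` and the sample rows are hypothetical and only for illustration. It assumes Spark SQL returns rows shaped like (database, tableName, isTemporary), while Impala returns a single name column.

```python
def extract_table_names(rows):
    """Return table names from SHOW TABLES rows of either shape (illustrative helper)."""
    names = []
    for row in rows:
        # Multi-column row (assumed Spark SQL shape): the table name is at index 1.
        # Single-column row (assumed Impala shape): the table name is at index 0.
        names.append(row[1] if len(row) > 1 else row[0])
    return names


# Hypothetical sample rows; "hosts" and "ports" are made-up table names.
impala_rows = [("hosts",), ("ports",)]
spark_rows = [("scannetwork", "hosts", False), ("scannetwork", "ports", False)]

print(extract_table_names(impala_rows))
print(extract_table_names(spark_rows))
```

With the original `tup[0]` logic, the Spark-shaped rows would yield the database name for every entry, which matches the symptom in the screenshots.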

The path 'site-packages/impala/sqlalchemy.py' doesn't exist if you only install pyhive. So how can this be edited?
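
One possible approach that avoids editing installed files is to replace the method on the dialect class at startup. The following is only a sketch under assumptions: `patch_get_table_names` is a hypothetical helper, and it assumes the dialect in use exposes a `get_table_names(self, connection, schema=None, **kw)` method like the impyla one quoted in this thread, and that `connection.execute` accepts a plain SQL string (as in SQLAlchemy 1.2).

```python
def patch_get_table_names(dialect_cls):
    """Replace dialect_cls.get_table_names with a version that tolerates
    multi-column SHOW TABLES rows (hypothetical helper, not part of any library)."""

    def get_table_names(self, connection, schema=None, **kw):
        query = 'SHOW TABLES'
        if schema is not None:
            query += ' IN %s' % schema
        rows = connection.execute(query).fetchall()
        # Multi-column rows: take the second column as the table name;
        # single-column rows: take the only column.
        return [r[1] if len(r) > 1 else r[0] for r in rows]

    dialect_cls.get_table_names = get_table_names
    return dialect_cls
```

With impyla installed, this might be applied as `patch_get_table_names(ImpalaDialect)` after `from impala.sqlalchemy import ImpalaDialect`. For a pyhive-only install, the corresponding dialect class would have to be located in that package first; which class that is, and whether its rows have the same shape, is not confirmed in this thread.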
