I can successfully create an Athena database; in the database console I can even see the tables in my database.
However when I try to create a new table I have an error. The output of the console looks like this:
2017-10-02 15:58:38,866:DEBUG:pyathenajdbc.connection:JVM path: /Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home/jre/lib/jli/libjli.dylib
2017-10-02 15:58:38,867:DEBUG:pyathenajdbc.connection:JVM args: [u'-server', u'-Djava.class.path=/tmp/venv/lib/python2.7/site-packages/pyathenajdbc/AthenaJDBC41-1.1.0.jar']
log4j:WARN No appenders could be found for logger (com.amazonaws.athena.jdbc.AthenaDriver).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
2017-10-02 15:58:41,131:DEBUG:pyathenajdbc.converter:{'REAL': 7, 'CLOB': 2005, 'TIME_WITH_TIMEZONE': 2013, 'LONGVARBINARY': -4, 'CHAR': 1, 'TINYINT': -6, 'BIGINT': -5, 'REF_CURSOR': 2012, 'DATE': 91, 'VARBINARY': -3, 'NULL': 0, 'REF': 2006, 'LONGNVARCHAR': -16, 'SMALLINT': 5, 'VARCHAR': 12, 'DISTINCT': 2001, 'NUMERIC': 2, 'NVARCHAR': -9, 'INTEGER': 4, 'ARRAY': 2003, 'JAVA_OBJECT': 2000, 'DATALINK': 70, 'SQLXML': 2009, 'DECIMAL': 3, 'OTHER': 1111, 'TIMESTAMP_WITH_TIMEZONE': 2014, 'BIT': -7, 'NCHAR': -15, 'BINARY': -2, 'STRUCT': 2002, 'DOUBLE': 8, 'LONGVARCHAR': -1, 'FLOAT': 6, 'BOOLEAN': 16, 'BLOB': 2004, 'ROWID': -8, 'TIME': 92, 'NCLOB': 2011, 'TIMESTAMP': 93}
2017-10-02 15:58:41,137:DEBUG:pyathenajdbc.cursor:SELECT
table_schema,
table_name,
column_name,
data_type,
is_nullable,
column_default,
ordinal_position,
comment
FROM information_schema.columns
2017-10-02 15:58:43,384:ERROR:pyathenajdbc.cursor:Failed to execute query.
Traceback (most recent call last):
File "/tmp/venv/lib/python2.7/site-packages/pyathenajdbc/cursor.py", line 172, in execute
result_set = self._statement.executeQuery(query)
java.sql.SQLExceptionPyRaisable: java.sql.SQLException: Internal error
2017-10-02 15:58:43,386:ERROR:root:(pyathenajdbc.error.DatabaseError) java.sql.SQLException: Internal error [SQL: u'\n SELECT\n table_schema,\n table_name,\n column_name,\n data_type,\n is_nullable,\n column_default,\n ordinal_position,\n comment\n FROM information_schema.columns\n ']
Traceback (most recent call last):
File "/tmp/venv/lib/python2.7/site-packages/superset/connectors/sqla/views.py", line 243, in pre_add
table.get_sqla_table_object()
File "/tmp/venv/lib/python2.7/site-packages/superset/connectors/sqla/models.py", line 583, in get_sqla_table_object
return self.database.get_table(self.table_name, schema=self.schema)
File "/tmp/venv/lib/python2.7/site-packages/superset/models/core.py", line 715, in get_table
autoload_with=self.get_sqla_engine())
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/sql/schema.py", line 439, in __new__
metadata._remove_table(name, schema)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/sql/schema.py", line 434, in __new__
table._init(name, metadata, *args, **kw)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/sql/schema.py", line 514, in _init
include_columns, _extend_on=_extend_on)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/sql/schema.py", line 527, in _autoload
_extend_on=_extend_on
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 2045, in run_callable
return conn.run_callable(callable_, *args, **kwargs)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1534, in run_callable
return callable_(self, *args, **kwargs)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 372, in reflecttable
table, include_columns, exclude_columns, **opts)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/reflection.py", line 598, in reflecttable
table_name, schema, **table.dialect_kwargs):
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/reflection.py", line 369, in get_columns
**kw)
File "<string>", line 2, in get_columns
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/reflection.py", line 54, in cache
ret = fn(self, con, *args, **kw)
File "/tmp/venv/lib/python2.7/site-packages/pyathenajdbc/sqlalchemy_athena.py", line 145, in get_columns
} for row in connection.execute(query).fetchall()
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 939, in execute
return self._execute_text(object, multiparams, params)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1097, in _execute_text
statement, parameters
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
context)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1402, in _handle_dbapi_exception
exc_info
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
context)
File "/tmp/venv/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
cursor.execute(statement, parameters)
File "/tmp/venv/lib/python2.7/site-packages/pyathenajdbc/util.py", line 34, in _wrapper
return wrapped(*args, **kwargs)
File "/tmp/venv/lib/python2.7/site-packages/pyathenajdbc/util.py", line 24, in _wrapper
return wrapped(*args, **kwargs)
File "/tmp/venv/lib/python2.7/site-packages/pyathenajdbc/cursor.py", line 194, in execute
raise_from(DatabaseError(*e.args), e)
File "/tmp/venv/lib/python2.7/site-packages/future/utils/__init__.py", line 454, in raise_from
raise e
DatabaseError: (pyathenajdbc.error.DatabaseError) java.sql.SQLException: Internal error [SQL: u'\n SELECT\n table_schema,\n table_name,\n column_name,\n data_type,\n is_nullable,\n column_default,\n ordinal_position,\n comment\n FROM information_schema.columns\n ']
I think it has to do with how SQLAlchemy interacts with PyAthena, not necessarly with Superset... but I am not sure
Python 2.7.13
0.20.0
27:Flask-SQLAlchemy==2.1
61:SQLAlchemy==1.1.9
62:SQLAlchemy-Utils==0.32.16
48:PyAthenaJDBC==1.3.0
What does your connection string look like? What dialect are you using?
It looks like it raises calling the dialiect's get_table method. And it looks like its using INFORMATION_SCHEMA.columns behind the scene, which I think works from Presto, it may just be something around a reserved word not being set in the dialect, and used in that query to get the columns information.
Thank you for the swift reply, here is my connection string:
awsathena+jdbc://aws_access_key_id:[email protected]/schema?s3_staging_dir=staging_bucket
Regarding information_schema.colums it does not seem to exist in the system tables of Athena. It seems weird also that the query to that table (as it appears in logs) does not filter by table_name...
you may have to re-write that in the dialect and use SHOW COLUMNS instead:
http://docs.aws.amazon.com/athena/latest/ug/ddl/show-columns.html
Done here https://github.com/sdia/PyAthenaJDBC/commit/da61f6cbfdd83f154d8e12b1df516dbc804747a3
Currently talking with PyAthenaJDBC folks to submit PR (could not manage to make their tests suite work on my local env)
keeping you posted...
Looks like they have a .travis.yml in their repo, meaning you should be able to use Travis to run the test suite and it should just work. It's only a matter of configuring Travis to point to your fork.
I've also just run into this issue, will give @sdia's fork a try.
Same issue too, @paulbjensen, @sdia have you validated the fix ?
I try to modify the .py according to the fork without success
Warning: this issue has been inactive for 183 days and will be automatically closed on 2018-04-29 if there is no further activity. You can add the keepup label to the issue in order for this bot to keep it open.
When I try to connect to my Athena datasource from superset server I get this error:
ERROR: {"error": "Connection failed!\n\nThe error message returned was:\nCan't load plugin: sqlalchemy.dialects:awsathena.jdbc"}
Installed packages:
pip install "PyAthenaJDBC>1.0.9"
pip install "PyAthena>1.2.0"
Connection string used:
awsathena+jdbc://accesskey:[email protected]/schema?s3_staging_dir=my-bucket-name
can someone please help me what is going on here
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Most helpful comment
I've also just run into this issue, will give @sdia's fork a try.