When Redash is running on an EC2 instance, that instance may be able to access Athena via its IAM role and therefore the connector configuration shouldn't require AWS credentials. There is a minor code change (patch forthcoming) required to support this so that the "None" credential values will allow the underlying library to fall back to the IAM role.
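The kind of change described above could look roughly like the sketch below. It is not the forthcoming patch: the helper names `none_if_blank` and `athena_connect_kwargs` are hypothetical, and the configuration keys are taken from the data source options shown later in this thread. The idea is only that blank credential fields become `None` before they reach the underlying library, so it can fall back to the instance's IAM role.

```python
def none_if_blank(value):
    """Return None for a missing or blank value, otherwise the value unchanged."""
    if value is None or not str(value).strip():
        return None
    return value


def athena_connect_kwargs(configuration):
    """Build keyword arguments for the Athena client library.

    Hypothetical helper: blank credentials are passed as None so the
    library is free to resolve credentials on its own (e.g. an IAM role).
    """
    return {
        "s3_staging_dir": configuration["s3_staging_dir"],
        "region_name": configuration["region"],
        "aws_access_key_id": none_if_blank(configuration.get("aws_access_key")),
        "aws_secret_access_key": none_if_blank(configuration.get("aws_secret_key")),
    }


# Example: credentials left empty in the data source form.
kwargs = athena_connect_kwargs({
    "aws_access_key": "",
    "aws_secret_key": "",
    "region": "us-west-2",
    "s3_staging_dir": "s3://our-athena-query-bucket/logs",
})
```

With explicit keys supplied, the same helper passes them through untouched, so existing key-based data sources would behave as before.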
To reproduce:
1) Deploy Redash on an EC2 instance in AWS
2) Set up an Athena table in AWS
3) Grant the EC2 instance access to the Athena table via IAM roles
4) Attempt to create a data source in Redash that connects to the Athena table, but don't provide any credentials. It will not connect.
Redash 8.0 beta on EC2 (CentOS), running under Docker Compose
It works for me in Redash 7. Looking at the code in 8, if you set the environment variable ATHENA_OPTIONAL_CREDENTIALS to False, you should be fine:
OPTIONAL_CREDENTIALS = parse_boolean(os.environ.get('ATHENA_OPTIONAL_CREDENTIALS', 'true'))
I vote for full-featured role-based AWS access.
This would apply equally well to other AWS data sources if they are added.
@max-lobur you can enable role based access for the Athena data source (check the implementation for details).
The main reason this isn't enabled by default is that there are various questions about caching and scheduled queries that aren't answered yet when using role based access instead of a single user.
@arikfr LMK if you need a hand with IAM, drop a link to a discussion. I can't imagine any issues with it.
If an instance has permission to assume a role, it can assume it at any time without user participation (unless the role requires MFA). Any cache or delayed execution should work flawlessly; I use this a lot.
@arikfr Could you point me to the right doc?
I'm running the image redash/redash:7.0.0.b18042 in a single pod with the server, scheduler, and redis:5.0-alpine, plus an external RDS PostgreSQL. Everything works as expected, but, as @max-lobur mentioned, we want to expand our data sources and add Athena as an additional one that we can query.
From the pod, as from the EC2 instance, we can reach our resources (S3, Athena) using role-based access, but we can't use Athena as an additional data source.
python manage.py ds new "athena" --type "athena" --options '{"aws_access_key": "", "aws_secret_key": "", "region": "us-west-2", "s3_staging_dir": "s3://our-athena-query-bucket/logs"}'
But when we test it, we get this error:
redash@redash-szvx9:/app# python manage.py ds test "athena"
[2019-12-23 16:27:42,958][PID:4703][INFO][root] Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2019-12-23 16:27:42,977][PID:4703][INFO][root] Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
Testing connection to data source: athena (id=9)
[2019-12-23 16:27:44,740][PID:4703][ERROR][pyathena.common] Failed to execute query.
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/pyathena/common.py", line 154, in _execute
**request)
File "/usr/local/lib/python2.7/dist-packages/pyathena/util.py", line 57, in retry_api_call
return retry(func, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tenacity/__init__.py", line 358, in call
do = self.iter(retry_state=retry_state)
File "/usr/local/lib/python2.7/dist-packages/tenacity/__init__.py", line 319, in iter
return fut.result()
File "/usr/local/lib/python2.7/dist-packages/concurrent/futures/_base.py", line 455, in result
return self.__get_result()
File "/usr/local/lib/python2.7/dist-packages/tenacity/__init__.py", line 361, in call
result = fn(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 661, in _make_api_call
raise error_class(parsed_response, operation_name)
ClientError: An error occurred (UnrecognizedClientException) when calling the StartQueryExecution operation: The security token included in the request is invalid.
Failure: An error occurred (UnrecognizedClientException) when calling the StartQueryExecution operation: The security token included in the request is invalid.
We tested every option with the environment variable, but got the same error on each attempt.