Amazon Athena (https://aws.amazon.com/athena/) is based on Hive/HiveQL/Presto and is provided by Amazon. It is essentially BigQuery in AWS land.
A JDBC driver and HiveQL compatibility are provided out of the box. Presto provides low-latency, interactive querying and is much better suited than the Hive support requested in #2157.
Fingers crossed and interested to see if there is any Athena adoption among Metabase users this early (Athena was announced a couple weeks ago at AWS re:Invent 2016).
⬇️ Please click the 👍 reaction instead of leaving a +1 or 👍 or updates? comment
what are the thoughts on support for this?
Hey @cwbeck. We generally prioritize support for databases based on community demand. Given the current number of upvotes for Athena, it's possible that might take a bit.
In the meantime, if you're feeling adventurous and have some experience with Clojure, you could try writing a driver yourself. @camsaul is our resident expert and can help with any questions you might have.
@kdoh I've not used Clojure. We currently use Scala. Since Clojure runs on the JVM, I don't suppose the driver could be written in Scala or Java and referenced from Clojure?
@cwbeck we're pretty stringent on database driver style + testing, so it should be in clojure. Also note that unless you're committing to supporting the driver in scala/java indefinitely, we'll be on the hook for any future modifications + changing our build process.
@kdoh Metabase looks promising so far and we, my colleagues at AdGoji and I, want to start using it for Athena. We have some specific requirements, but for starters what we want is something generic. Even though we are a Clojure company, to save ourselves some time and to sponsor this project as well, we are thinking of outsourcing this integration work.
I'll send a message privately for a quote on this work. If others are interested in sharing the sponsoring please let me know.
Athena on Metabase would be great. I know other open source tools such as re:dash (Python based), Superset (Python based), and Piwik all support Athena if you need something now. But to my understanding it's just a specialized (Hive/Presto) JDBC driver provided by Amazon.
I would love to have this. Querying audit logs in S3 files would be awesome!
I've been looking into this: we can currently query S3 through "Redshift Spectrum". It's a very simple config that needs to be done in Redshift, so that each time an external table is queried, Redshift sends that query to Athena. I believe this is the simplest solution so far!
@paoliniluis
Interesting approach, can you give us some feedback on it?
@paoliniluis that is actually a great idea. But like @julienba, I'm also curious to know how you've got on with that.
So I haven't had any luck using Spectrum in Metabase yet. Am I misconfiguring something? It doesn't like the whole external schema concept too much. Also note that Redshift Spectrum does not support JSON yet. It's kind of hidden in the fine print, but it's just CSV, Parquet, and ORC. Correct me if I'm wrong :)
@julienba @cwbeck the setup is pretty straightforward, as you just need to unload the big, fact tables into S3 with the UNLOAD command. The query times are very similar with CSV files, so the real speed gain would be if the files are in presto format and also zipped. The tech works but it needs a lot of preparation to be used in its full capacity. Ping me if needed
+1, since more and more people use athena as a BI tool, and AWS provided a driver as well, it's really a good proposal. hope you guys have it considered.
thanks so much:)
+10
Can this JDBC driver be used as a starting point at least?
https://s3.amazonaws.com/athena-downloads/drivers/AthenaJDBC41-1.1.0.jar
+1
What's the current situation with this? Has anything been done so far? Is there any starting point?
We're a Clojure company, so we could collaborate.
@ricardoekm see the #5439 PR referenced above. Looks like something you want to take for a spin 😃
Awesome @jornh. Thanks!
Please provide support for Redshift Spectrum also.
@sumit1294 Metabase supports Redshift which means it supports external tables that are backed by Redshift Spectrum.
@rgabo I tried executing a query through Metabase but got a permission denied error:
ERROR: permission denied for schema
P.S. Have granted schema access to the Metabase user.
grant SELECT on ALL TABLES IN SCHEMA spectrum_prod to metabase_r;
Seems like it has no support for spectrum just like Redash.
https://discuss.redash.io/t/support-for-amazon-redshift-external-tables/1178
I really don't think Redshift Spectrum is a valid workaround. It's designed to utilize the compute resources of companies that have already made the investment in a Redshift cluster, so they can ad hoc query over S3 data. Athena support is really needed to allow for ad hoc querying of S3 data. The new S3 Select service could also be useful; it's still in preview but doesn't require any Hive DDLs like Athena does. http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectSELECTContent.html
At a high level this is the cheapest way to get to ad hoc querying of flat files à la Tableau, IMHO. Just drop a file into a bucket and add it as a database. ETL is painful stuff, and this could help sidestep much of the mess.
Has anyone had any luck with this yet? Or with using RS Spectrum? I know with the announcement of S3 Select this might get put on the back burner.
Need AWS Athena
Is the party line position that the way to make this happen is to write a JDBC driver in Clojure?
EDIT:
Looks like there is a PR, #5439, which is getting close!
@samhavens Yeah, but there are at least two issues that prevent PR #5439 from being merged:
prepare-value function call from the athena driver. Yet, it is probably highly insufficient (ping @camsaul, is it?). I don't know much about Clojure, but having a clearer idea of what has to be done would maybe encourage contributions from me or somebody else.
@gilbsgilbs to the second point, how can we help? I can't imagine the testing usage costing more than a few pennies a month. I feel like a one-time payment of $100 should cover this in perpetuity...
We're also very eager for an Athena driver. We can contribute if that's necessary to get CI running and the build merged.
If you're taking votes, I'd like to see Athena support also.
I would love to see Athena support!
This would be awesome
Hi, guys! We've been running Metabase locally with the Athena driver. It is awesome!! But we can't deploy it to production because of internal rules that don't allow forks to be published... So we're wondering when we can have Athena available in master!
Let us know if the community can be of any help!!
@jonesmadrugaGH that would be amazing
There seems to be plenty of interest in this integration, but I am unsure if @julienba, who had the original PR, is still actively pursuing it. Maybe it would be better to create a new pull request from the modified branch by @ricardoekm or @gilbsgilbs, update it with master, and have a new look at the code?
@larseen I agree but I'll be too busy in the upcoming weeks to really spend much time on it (especially given that I'm far from fluent in Clojure). But one can totally take my branch as base to make a PR. If somebody takes it, they would probably need to merge @ricardoekm's work on this branch as well, which brings nested fields support. Once this is done, I believe the remaining work is to fix the tests and find a decent way to get the CI running without too much maintenance overhead.
cc @OlivierMns, you might also be interested.
It also seems the JDBC driver has been updated. I just tested the new driver and it is not compatible, which AWS also states:
The current JDBC driver version 2.x is not a drop-in replacement of the previous version of the JDBC driver, and is not backwards compatible with the JDBC driver version 1.x that you used before.
https://aws.amazon.com/about-aws/whats-new/2018/04/amazon-athena-updated-jdbc-driver-launch/
I'm not that proficient in Clojure either, but I could create a setup for CI, if someone gets onboard to fix this feature and create a decent Pull Request.
Shouldn't need to use a JDBC driver, their API supports queries directly:
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/athena/AmazonAthenaClient.html#startQueryExecution-com.amazonaws.services.athena.model.StartQueryExecutionRequest-
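A rough sketch of what that direct-API route involves, written in Python with boto3 rather than the Java SDK linked above. The helper name run_athena_query, its parameters, and the polling limits are illustrative assumptions and are not taken from any Metabase driver. The three client calls (start_query_execution, get_query_execution, get_query_results) are the boto3 counterparts of the Java methods. The API is asynchronous, so the caller has to poll until the query finishes.

```python
import time


def run_athena_query(client, sql, database, output_location,
                     poll_seconds=1.0, max_polls=60):
    """Start an Athena query, poll until it finishes, return the first page of results.

    `client` is expected to behave like boto3.client("athena").
    The function name and defaults are hypothetical, for illustration only.
    """
    start = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes result files to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = start["QueryExecutionId"]

    for _ in range(max_polls):
        execution = client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        state = status["State"]
        if state == "SUCCEEDED":
            # Only the first page; larger results need NextToken pagination.
            return client.get_query_results(QueryExecutionId=query_id)
        if state in ("FAILED", "CANCELLED"):
            reason = status.get("StateChangeReason", "no reason given")
            raise RuntimeError(f"Athena query {query_id} {state}: {reason}")
        time.sleep(poll_seconds)

    raise TimeoutError(f"Athena query {query_id} did not finish in time")


# Usage (assumes boto3 is installed and AWS credentials are configured;
# the database and bucket names are placeholders):
#   import boto3
#   client = boto3.client("athena")
#   page = run_athena_query(client, "SELECT 1", "default", "s3://my-bucket/results/")
```

Going this way means handling polling, pagination, and type mapping by hand, which is the work a JDBC driver otherwise does.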
We've updated the driver in our branch; this fixes array support.
Also we've merged with 0.29 and added @gilbsgilbs contributions.
We'll make a new pull request to Metabase.
Sorry, I didn't get it, @ricardoekm. So 0.29 has Athena support? That would be great news.
We have a fork of 0.29 with Athena support (https://github.com/B2W-BIT/metabase), still working on the pull requests (some tests are failing).
+1 to Athena Support.
@ricardoekm If we would like to test your fork, do you have a CI setup for a nightly build? Just a local test would be great.
Hi @whatnick,
We're currently running the tests against our own AWS account. We're preparing an independent account to hand over to the Metabase team, though.
@ricardoekm - I have compiled and run your fork https://github.com/B2W-BIT/metabase
but I don't see an option in the UI to add Athena.
Does someone have an answer?
@hanangithub you can follow this guide
The basic idea is to create a folder _plugins_ in your app root and put the driver jar in it, like this:
./metabase.jar
./plugins/AthenaJDBC42_2.0.2.jar
Important: run it from the app root path.
This happened to me: when I built and ran from the repo root, it didn't find the driver jar.
~java -jar /target/uberjar/metabase.jar~ DOES NOT WORK!
./bin/build
cd target/uberjar
java -jar metabase.jar
:+1:
Anyway, you can download and run from BIT-B2w pre-release
@ricardoekm @thslopes etc. (if you didn't see the AWS note already) bumping the Athena JDBC dependency version to 2.0.5 for a 2x or 5x-6x performance increase seems ... worthwhile:
https://docs.aws.amazon.com/athena/latest/ug/release-note-2018-08-16.html
With Metabase 0.30 just out the door, what do you guys think about aiming to get tests, and CI setup running so it can potentially land in 0.31 or 0.32? Any way for others to help at this stage?
Hi @jornh,
We'll check that. @thslopes went on vacation. We'll get back to this issue as soon as he returns.
Roger that! Hope he enjoys time off 🌴
Hello, guys
Finally!
PR opened #8577
:heavy_check_mark: Athena Driver 2.0.5
Please, metafriends, take a look.
Any idea when this PR will be merged in metabase:master?
any update on the merge and release?
would love for this to be merged and released!!!!
Please merge!
We've been trying to submit a PR, but we've been outpaced by the Metabase team. We fixed the merge and test issues in 0.27. Then we merged and fixed tests and conflicts with 0.31. Then we tried to merge again with master, but we faced more conflicts and failing tests.
Given this situation, we've asked the Metabase team for help, and if anyone in the community is able to lend a hand, that would be welcome.
Any updates on this?
If the driver is being shipped as a plugin, can it please be included in the metabase docker image? I personally don't want to be in the business of packaging. I specifically chose metabase due to its very clean docker usage, which was significantly better than any of the competing OSS BI tools.
i'm going to get fired if this doesn't get shipped.
Any update? It's a really big deal for every one here! :)
Hi @ricardoekm, sorry we've changed the driver interface around over the last few months. As the person responsible for all these changes I can promise that there are no future plans to change the current interface any further so hopefully any work done against it will need no changes for a very long time.
With Metabase's new plugin system it would actually be theoretically possible to load plugins without restarting your Metabase instance, we're considering something like a "Metabase Plugin Store" where you can enable plugins directly from the Admin Page.
Before you can hot-load an Athena driver plugin, of course, the driver would have to be ready. @ricardoekm might I recommend posting a link to what you have so other members of the community can help contribute to a working driver?
Any news?
There's an AWS employee who says he's working on a driver: https://github.com/metabase/metabase/pull/8577#issuecomment-481160698
I've released an early version of the driver here: https://github.com/dacort/metabase-athena-driver/releases/tag/v0.0.3
Feel free to check it out and file any issues you might come across.
We've submitted a PR with the missing features of the driver: https://github.com/dacort/metabase-athena-driver/pull/10
It seems with this PR the driver is finished. We're rolling out to production this week to test with a real-world scenario.
Has anybody been using the new driver? How is query performance?
Is the driver working then? Should this issue be closed pointing to https://github.com/dacort/metabase-athena-driver ?
Hi everyone, just landing here to ask for guidelines on configuring Metabase with Amazon Athena. After some research I've downloaded metabase-athena-driver v0.2.1 (from here: https://github.com/dacort/metabase-athena-driver/releases/tag/v0.2.1) and Metabase v0.33.4 (from here: https://github.com/metabase/metabase/releases/tag/v0.33.4), since the Athena driver says it was tested with that version of Metabase.
I put the driver into the plugins directory, but when I start up Metabase I only see this line in the logs:
12-10 08:59:47 INFO plugins.classloader :: Added URL file:/C:/software/Metabase/plugins/athena.metabase-driver.jar to classpath
But when I try to create a new database, there is no Athena option shown in the select field!
Can someone provide me some help to configure the Metabase's Athena driver in Metabase?
Thanks in advance!
@czarbl
Are you defining MB_PLUGINS_DIR manually? It needs to point to a writable directory, and should be auto-populated with the built-in drivers, when Metabase starts.
You should see these lines too, towards the end of driver registration:
12-10 13:31:41 DEBUG plugins.lazy-loaded-driver :: Registering lazy loading driver :athena...
12-10 13:31:41 INFO metabase.driver :: Registered driver :athena (parents: [:sql-jdbc]) 🚚
Please use the forum for questions and troubleshooting: https://discourse.metabase.com/
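The two checks described above (a writable plugins directory, and the driver registration line in the startup log) can be scripted. This is a minimal Python sketch. The function names are hypothetical, and the pattern assumes the log contains the "Registered driver :athena" text quoted above.

```python
import os
import re

# Matches the registration line quoted above; the exact wording may differ
# between Metabase versions, so treat this pattern as an assumption.
REGISTERED_ATHENA = re.compile(r"Registered driver :athena\b")


def plugins_dir_is_writable(path):
    """True if `path` exists, is a directory, and the current user can write to it."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


def athena_driver_registered(log_lines):
    """True if any log line reports that the :athena driver was registered."""
    return any(REGISTERED_ATHENA.search(line) for line in log_lines)


# Usage (the log file name is a placeholder):
#   plugins_dir_is_writable(os.environ.get("MB_PLUGINS_DIR", "./plugins"))
#   with open("metabase.log", encoding="utf-8") as f:
#       athena_driver_registered(f)
```

If the jar shows up as "Added URL ... to classpath" but the registration line never appears, the driver was found but not loaded, which matches the symptom reported above.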
We are using the following Dockerfile, which extends the official image with the Athena plugin:
FROM metabase/metabase:v0.33.3
ENV MB_JETTY_PORT 8080
ADD Procfile /app/source
ENV MB_PLUGINS_DIR /plugins
RUN cd ${MB_PLUGINS_DIR} && wget https://github.com/getfiit/metabase-athena-driver/releases/download/v0.1.1/athena.metabase-driver.jar
We have a custom build of the driver for workgroup support, which actually got merged in v0.1.1, so there is no reason not to use the official one.
Hi flamber, "apparently" the problem I described could be a Windows issue, because I replicated the same steps in an Ubuntu environment and it worked! Now I can see the Athena option when I create a new database in Metabase.
Hi netproteus, after making some minor changes to the Dockerfile I got your solution working:
FROM metabase/metabase:v0.33.4
ENV MB_JETTY_PORT 8080
ENV MB_PLUGINS_DIR /plugins
RUN cd ${MB_PLUGINS_DIR} && wget https://github.com/getfiit/metabase-athena-driver/releases/download/v0.1.1/athena.metabase-driver.jar
# expose port
EXPOSE 8080
With this solution I can also choose the Athena driver when I create a new database in Metabase!
@czarbl use the latest @dacort one; all the changes in our (@getfiit) build, plus more, are in that one.
@czarbl Every time you post a comment, a lot of people get notifications. GitHub is an issue/feature tracker, not the right place to get support.
Please use the forum for questions and troubleshooting: https://discourse.metabase.com/
Any updates?
why is this still open?
From my understanding there is a plugin to support this:
https://github.com/dacort/metabase-athena-driver
What is the scope of this to get closed?