There is a module to connect SQLAlchemy to BigQuery, but it only has partial support. I'd love to be able to connect to BigQuery using SQLAlchemy this way and be able to play with Caravel as was mentioned in that partial support issue.
To do this properly, we'd need to provide a module with an implementation of a SQLAlchemy dialect I would imagine we'd want to package this separately so as not to add a dependency on SQLAlchemy. Looking more closely at the dialect interface it looks like the database driver would need to implement the Python DB-API.
I'm considering this a low priority feature request, but do think it would be cool.
This is a good topic, very useful.
@tswast BigQuery's DML support is still marked as experimental: I would want to see some API stability commitment on the back-end before trying to implement DB-API semantics.
I'd like to use this on my projects. Are there plans to build support?
@jkilpatricksc I have a pull request open, but it's not complete. Working on it!
There is some tentative of plugging sql alchemy:
https://pypi.python.org/pypi/sqlalchemy_bigquery/
Having Big Query connector to SQL Alchemy would give a strong incentive to use Big Query
Maintainer on Superset here (https://github.com/airbnb/superset)
@tswast I'm curious to hear how far you made it. It seems like it should be fairly straightforward to mimic DBAPI and to define a SQLAlchemy dialect. Note that the dialect can be incomplete and limited to work for read, as [of course] you wouldn't use DML or expect to use an ORM on top BigQuery. Also there's a need to implement the metadata-fetching methods to allow us to fetch metadata around tables/views/columns/types.
For reference, the pyhive project (https://github.com/dropbox/PyHive) does exactly this (mimic DBAPI and defines a SQLAlchemy dialect) on top of Presto and Hive, and it works very well for Superset and other interfaces. It's really nice to have a common interface (sqlalchemy) to all databases for people building tools that are expected to work against many databases.
I paused my work on https://github.com/GoogleCloudPlatform/google-cloud-python/pull/2921 but am restarting it now. I'm glad to hear there's interest in this.
query.num_dml_affected_rows, needed to implement cursor.rowcount.The DB-API implementation https://github.com/GoogleCloudPlatform/google-cloud-python/issues/2921 is now merged.
I've filed https://github.com/GoogleCloudPlatform/google-cloud-python/issues/3603 to track SQLAlchemy support.
The documentation for the BigQuery DB-API is now published.
Most helpful comment
The DB-API implementation https://github.com/GoogleCloudPlatform/google-cloud-python/issues/2921 is now merged.
I've filed https://github.com/GoogleCloudPlatform/google-cloud-python/issues/3603 to track SQLAlchemy support.