0.26.0 - following docker-compose instructions
Be able to run World Bank dashboard.
I'm not able to run any query that state particular wb_health_population column as in Postgres columns created with " " should be referenced this way as well. The exact same problem as here: https://stackoverflow.com/questions/24253814/cant-select-an-existing-column-in-postgresql
2018-09-29 14:00:38,127:INFO:root:SELECT SP_POP_TOTL
FROM wb_health_population
LIMIT 1
2018-09-29 14:00:38,128:ERROR:root:column "sp_pop_totl" does not exist
LINE 1: SELECT SP_POP_TOTL
^
Traceback (most recent call last):
File "/home/work/incubator-superset/superset/sql_lab.py", line 185, in execute_sql
db_engine_spec.execute(cursor, query.executed_sql, async_=True)
File "/home/work/incubator-superset/superset/db_engine_specs.py", line 376, in execute
cursor.execute(query)
psycopg2.ProgrammingError: column "sp_pop_totl" does not exist
LINE 1: SELECT SP_POP_TOTL
Follow docker-compose instructions and try to render WB dashboard.
I hope that it's not because of the fact I'm still using 0.26. Either way I would be very happy to help with improving test data set generation or with Docker images maintenance.
The fix I'd like to propose is to lower column names before running pandas' to_sql()
> pd.DataFrame({'A': range(10)}).to_sql('table_2', engine)
> pd.DataFrame({'a': range(10)}).to_sql('table_2', engine)
...
> cur.execute("select a from table_1")
> print(cur.fetchall())
---------------------------------------------------------------------------
ProgrammingError Traceback (most recent call last)
<ipython-input-54-a6e1e7bd84f4> in <module>()
----> 1 cur.execute("select a from table_1")
2 print(cur.fetchall())
ProgrammingError: column "a" does not exist
LINE 1: select a from table_1
...
> cur.execute("select a from table_2")
> print(cur.fetchall())
[(0,), (1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,), (9,)]
I'm having similar problems. Do you have a fix worked out?
Changing the column name to include double quotes works.
SELECT country_name AS country_name, SUM(SP_POP_TOTL) AS "sum__SP_POP_TOTL"
FROM wb_health_population
WHERE year >= '2014-01-01 00:00:00' AND year <= '2014-01-02 00:00:00' GROUP BY country_name ORDER BY "sum__SP_POP_TOTL" DESC
LIMIT 50000
to
SELECT country_name AS country_name, SUM("SP_POP_TOTL") AS "sum__SP_POP_TOTL"
FROM wb_health_population
WHERE year >= '2014-01-01 00:00:00' AND year <= '2014-01-02 00:00:00' GROUP BY country_name ORDER BY "sum__SP_POP_TOTL" DESC
LIMIT 50000
This is a Postgres-specific issue related to column names - Postgres converts all column names to lower case unless they are quoted. So, expected behaviour, but something we can account for. Nope, looks a bit more complicated, and may have to do 'MetricsControl'.
I'm working on a PR for this.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Any updates on the fix please?
I encountered a maybe related issue displaying a preview of the entire table in SQL Lab on wb_health_population. It was on 0.31 on Ubuntu 18.04 with Anaconda Python in a virtualenv.
@villebro is this related to some of your work?
I believe this is loosely related to #7131, as Postgres is sometimes very picky about the exact case of column names. I think the correct fix would be to compile the column names with the specific dialect when constructing the expressions for the sample data sources. I can look into this.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.
Most helpful comment
Changing the column name to include double quotes works.
SELECT country_name AS country_name, SUM(SP_POP_TOTL) AS "sum__SP_POP_TOTL"
FROM wb_health_population
WHERE year >= '2014-01-01 00:00:00' AND year <= '2014-01-02 00:00:00' GROUP BY country_name ORDER BY "sum__SP_POP_TOTL" DESC
LIMIT 50000
to
SELECT country_name AS country_name, SUM("SP_POP_TOTL") AS "sum__SP_POP_TOTL"
FROM wb_health_population
WHERE year >= '2014-01-01 00:00:00' AND year <= '2014-01-02 00:00:00' GROUP BY country_name ORDER BY "sum__SP_POP_TOTL" DESC
LIMIT 50000