In [3]: pd.show_versions()
INSTALLED VERSIONS
commit: 9a66dbb
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-32-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.15.0-62-g9a66dbb
nose: None
Cython: 0.20.1post0
numpy: 1.8.1
scipy: 0.13.3
statsmodels: None
IPython: 1.2.1
sphinx: None
patsy: None
dateutil: 2.2
pytz: 2014.7
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: 0.9.3
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.8
pymysql: None
psycopg2: 2.5.3 (dt dec pq3 ext)
Here's an example stack trace when trying to insert data from a data frame with a column named CT Volume (cc)
:
Traceback (most recent call last):
File "infer_schemas.py", line 21, in <module>
df.to_sql(table, engine, index=False)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/pandas/core/generic.py", line 956, in to_sql
index=index, index_label=index_label, chunksize=chunksize)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/pandas/io/sql.py", line 534, in to_sql
chunksize=chunksize)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/pandas/io/sql.py", line 1134, in to_sql
table.insert(chunksize)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/pandas/io/sql.py", line 712, in insert
self._execute_insert(conn, keys, chunk_iter)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/pandas/io/sql.py", line 687, in _execute_insert
conn.execute(self.insert_statement(), data)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/sqlalchemy/engine/base.py", line 729, in execute
return meth(self, multiparams, params)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/sqlalchemy/sql/elements.py", line 322, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/sqlalchemy/engine/base.py", line 826, in _execute_clauseelement
compiled_sql, distilled_params
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/sqlalchemy/engine/base.py", line 958, in _execute_context
context)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/sqlalchemy/engine/base.py", line 1162, in _handle_dbapi_exception
util.reraise(*exc_info)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/sqlalchemy/util/compat.py", line 182, in reraise
raise value
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/sqlalchemy/engine/base.py", line 928, in _execute_context
context)
File "/usr/local/Cellar/python3/3.3.3/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/sqlalchemy/engine/default.py", line 433, in do_executemany
cursor.executemany(statement, parameters)
KeyError: 'CT Volume (cc'
@hammer Thanks for the report. Can you give a bit more details? What database flavor? Small example?
A small test with sqlite works for me on python 2:
In [6]: engine = create_engine('sqlite:///:memory:')
In [7]: df = pd.DataFrame({'CT Volume (cc)':[1,2,3]})
In [8]: df
Out[8]:
CT Volume (cc)
0 1
1 2
2 3
In [9]: df.to_sql('test_parantheses', engine)
In [10]: pd.read_sql_table('test_parantheses', engine)
Out[10]:
index CT Volume (cc)
0 0 1
1 1 2
2 2 3
I'm using the postgres
engine
OK, can reproduce it with postgres
Hmm, I am thinking this is possibly a sqlalchemy bug, as until the latest step in pandas in the stacktrace, everything seems fine.
If I print data, self.insert_statement
in before execute
(https://github.com/pydata/pandas/blob/v0.15.1/pandas/io/sql.py#L687), I get:
[{'index': 0L, 'CT Volume (cc)': 1L}, {'index': 1L, 'CT Volume (cc)': 2L}, {'index': 2L, 'CT Volume (cc)': 3L}]
INSERT INTO test_parantheses (index, "CT Volume (cc)") VALUES (%(index)s, %(CT Volume (cc))s)
so that looks all OK. The table is also correctly created in postgres.
Same issue seems to be reported here: https://github.com/onyxfish/csvkit/issues/317 and here: http://stackoverflow.com/questions/25917741/sqlalchemy-with-postgres-insert-into-a-table-whose-columns-have-parentheses without solutions.
I reported it at sqlalchemy: https://bitbucket.org/zzzeek/sqlalchemy/issue/3245/postgres-insert-into-a-table-whith-column
So closing this, as for now, I think this is not a bug in pandas. But thanks for the report!
OK, reopening, as we _can_ fix this from the pandas side if we want by using 'key's (see detailed response at https://bitbucket.org/zzzeek/sqlalchemy/issue/3245/postgres-insert-into-a-table-whith-column)
.key won't work right now in every case. passing "paramstyle='format'" to create_engine() for Postgresql engines will do it.
Is this fixable from the pandas side?
hey wes -
So the approach taken on my end in https://bitbucket.org/zzzeek/sqlalchemy/issues/3245/postgres-insert-into-a-table-whith-column now works, so you'd use .key on the Column so that ends up being the parameter name and no problem:
from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class A(Base):
__tablename__ = 'a'
id = Column(Integer, primary_key=True)
ct_volume = Column("CT Volume (cc)", Integer, key="ct_volume")
e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
Base.metadata.drop_all(e)
Base.metadata.create_all(e)
s = Session(e)
s.add(A(ct_volume=12))
s.commit()
print(s.query(A.ct_volume).scalar())
so... that will work, it is probably possible in theory for SQLAlchemy to even do something like that automatically as the Oracle dialect does some translation of parameter names too for weird chars.
Most helpful comment
.key won't work right now in every case. passing "paramstyle='format'" to create_engine() for Postgresql engines will do it.