Hi,
I want to truncate (or even drop) an existing table in SQL server. I didn't find any dedicated function in pandas.io.sql. So I tried to create an empty data frame just to call:
df1 = pd.DataFrame()
df1.to_sql(out_tbl, engine, if_exists='replace', index=False)
This code does drop the table but then generates exception from sqlalchemy as it tries to re-create an empty table. I can catch and ignore it, and the next call to_sql() with if_exists='append' will create the table correctly, but this is pretty ugly.
Another way I tried is to clear all the rows from the original data frame, leaving the columns:
df1 = df.drop(df.index)
df1.to_sql(out_tbl, engine, if_exists='replace', index=False)
This almost works: it truncates the table but then insert a single record with all fields as NULL...
You can see the full description at http://stackoverflow.com/questions/26205057/how-can-i-truncate-a-table-using-pandas#comment41130846_26205057
For dropping a table, you can:
meta = sqlalchemy.MetaData()
meta.reflect(bind=engine)
table = meta.tables['table_name']
table.drop(engine)
pdsql = pd.io.sql.SQLDatabase(engine)
pdsql.drop_table('table_name')
If you want to delete all rows (but not drop the table), that is currently not possible with pandas, but I think delete
does this in sqlalchemy:
meta = sqlalchemy.MetaData()
meta.reflect(bind=engine)
table = meta.tables['table_name']
table.delete(engine)
But maybe we should think about how we could integrate this in the pandas api.
Thanks for the quick and detailed response!
Indeed in this case I want to truncate it (i.e. delete all rows and leave it empty), and internally I'm sure it's already implemented as to_sql supports if_exists='replace', so I think that either use to_sql with empty df or explicit API to truncate it could be useful
Adi
Is it possible to make pandas delete all the rows instead of dropping the table for if_exists='replace'?
Having to drop the table results in deadlocks when there's a simultaneous SELECT going on, while a simple deletion would avoid such deadlocks.
Closing for now as Won't Fix. PRs welcome
Most helpful comment
Is it possible to make pandas delete all the rows instead of dropping the table for if_exists='replace'?
Having to drop the table results in deadlocks when there's a simultaneous SELECT going on, while a simple deletion would avoid such deadlocks.