Pandas: Cannot save DataFrame with unicode to CSV

Created on 27 Jan 2012 · 4 Comments · Source: pandas-dev/pandas

In [1]: from pandas import DataFrame
In [2]: df = DataFrame({u'c/\u03c3':[1,2,3]})
In [3]: df.to_csv('test')
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
.../<ipython-input-3-9b2e5ea53beb> in <module>()
----> 1 df.to_csv('test')

.../lib/python2.7/site-packages/pandas-0.7.0.dev_88fcac5-py2.7-macosx-10.4-x86_64.egg/pandas/core/frame.pyc in to_csv(self, path, sep, na_rep, cols, header, index, index_label, mode, nanRep)
    891                     # given a string for a DF with Index
    892                     index_label = [index_label]
--> 893                 csvout.writerow(list(index_label) + list(cols))
    894             else:
    895                 csvout.writerow(cols)

UnicodeEncodeError: 'ascii' codec can't encode character u'\u03c3' in position 2: ordinal not in range(128)

I think this should be separate from #680. The CSV issue is also mentioned in this comment on bug #300.

Unicode

All 4 comments

I presume you're using Python < 3? Unfortunately, the csv module there does not handle unicode. I'll see if there is a workaround, but as you can tell from the recurring issues, pandas isn't exactly unicode-friendly on Python <= 2.7, and neither is Python 2.7 itself ...
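For readers wondering where the error comes from: the traceback suggests the column label is being implicitly encoded with the ascii codec. A minimal sketch that reproduces just that step, without pandas or the csv module (the variable names here are illustrative, not from pandas):

```python
# The column label from the report: 'c', '/', and a Greek small sigma.
label = u"c/\u03c3"

# Encoding with the ascii codec fails, because sigma is outside range(128).
try:
    label.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    failed_at = exc.start  # index of the first character that could not be encoded

# Encoding explicitly as UTF-8 succeeds and yields plain bytes.
utf8_bytes = label.encode("utf-8")
```

The failing index matches the "position 2" in the error message above, which points at the sigma in the column name.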

I had to rewrite this because it slowed down CSV reading/writing. If you want to write a UTF-8 encoded CSV on Python < 3, you need to pass the encoding explicitly: df.to_csv(..., encoding='utf-8').
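A minimal sketch of that suggestion, using the frame from the original report. The temporary output path is an assumption made only for this example. On Python 3, to_csv already defaults to UTF-8, so the argument is redundant there but harmless.

```python
import os
import tempfile

import pandas as pd

# Same frame as in the report: one column whose name contains a Greek sigma.
df = pd.DataFrame({u"c/\u03c3": [1, 2, 3]})

# Hypothetical output location, chosen only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "test.csv")

# Pass the encoding explicitly, as suggested above.
df.to_csv(path, encoding="utf-8")

# Read the raw bytes back and decode them to check that the header survived.
with open(path, "rb") as fh:
    header = fh.read().decode("utf-8").splitlines()[0]
```

The header line starts with an empty field because the index is written as the first, unnamed column.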

Yes, I'm on 2.7. Thanks @adamklein !

@adamklein 2018 and still the same issue. Your trick helped. Thanks!
