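import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]})
df = df.drop(1)  # drops the row labelled 1, so the index is now [0, 2, 3] (non-default)
df.to_feather("test.feather")  # raises ValueError because the index is not the default one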
I saw documentation in the code requiring a default index at: https://github.com/pandas-dev/pandas/blob/794be8c7ab6897b7206f2c6ec60d22fea2e440a3/pandas/io/feather_format.py#L46-L52
However, I think it is unintuitive to require the user to have a default index before writing to feather (for example, this is not a requirement for writing to CSV). Why is this a requirement? What are your thoughts on making reindexing the default behavior?
pd.show_versions()
commit : None
python : 3.6.9.final.0
python-bits : 64
OS : Darwin
OS-release : 17.7.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.0
numpy : 1.17.0
pytz : 2019.2
dateutil : 2.8.0
pip : 19.2.2
setuptools : 41.0.1
Cython : None
pytest : 5.1.0
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 3.8.0
html5lib : 1.0.1
pymysql : None
psycopg2 : 2.8.3 (dt dec pq3 ext lo64)
jinja2 : 2.8.1
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 3.8.0
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.14.1
pytables : None
s3fs : None
scipy : 1.3.1
sqlalchemy : 1.3.7
tables : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
Feather (which is just pyarrow under the hood) does not save the index, so round-tripping fails with a non-default index. This is a design decision of the format to keep it simple. Parquet does allow saving the index.
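One possible workaround, sketched here under assumptions rather than taken from pandas itself, is to move the index into an ordinary column before writing and to restore it after reading. The helper names `write_feather_keep_index` and `read_feather_restore_index`, and the column name "index", are hypothetical choices for this illustration; the helpers need pyarrow installed when called.

```python
import pandas as pd


def write_feather_keep_index(df, path, index_name="index"):
    """Hypothetical helper: store the index as a regular column, then write."""
    # rename_axis names the index so reset_index creates a column with that name.
    # reset_index raises ValueError if a column with that name already exists.
    df.rename_axis(index_name).reset_index().to_feather(path)


def read_feather_restore_index(path, index_name="index"):
    """Hypothetical helper: read the file and turn the column back into the index."""
    return pd.read_feather(path).set_index(index_name)


df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]}).drop(1)

# In-memory view of what the write helper hands to to_feather:
flat = df.rename_axis("index").reset_index()   # columns: index, a, b; default RangeIndex
restored = flat.set_index("index")             # the original labels are the index again
```

With pyarrow available, `write_feather_keep_index(df, "test.feather")` followed by `read_feather_restore_index("test.feather")` should give back the original labels. If keeping the index matters more than the file format, `df.to_parquet("test.parquet")` is the alternative mentioned above.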
Got it, thanks for the info