Pandas: from_dict / to_dict should accept same orient parameter

Created on 26 Dec 2015 · 7 Comments · Source: pandas-dev/pandas

Hello,

pd.DataFrame.to_dict accepts an orient parameter in
['dict', 'list', 'series', 'split', 'records', 'index'] (the default being 'dict')

http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.DataFrame.to_dict.html

from_dict only accepts an orient parameter in ['columns', 'index'] (the default being 'columns').
I think that from_dict should accept the same orient values as to_dict, for API consistency.

http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.DataFrame.from_dict.html

Same for Panel
http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.Panel.from_dict.html
to_dict: missing method for Panel

Same for Series
http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.Series.to_dict.html
from_dict: missing method for Series
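
For the DataFrame case, the asymmetry described above can be illustrated with a minimal sketch. The sample frame and variable names are assumptions made up for illustration, and only the orient values listed in this issue are used; newer pandas releases may behave differently.

```python
import pandas as pd

# Illustrative frame (not from the original report).
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

# to_dict offers several layouts.
as_dict = df.to_dict()                     # default orient="dict": {column: {index: value}}
as_index = df.to_dict(orient="index")      # {index: {column: value}}
as_records = df.to_dict(orient="records")  # list of row dicts, index labels are dropped
as_split = df.to_dict(orient="split")      # separate "index", "columns" and "data" entries

# from_dict only understands the two dict-of-dicts layouts.
from_columns = pd.DataFrame.from_dict(as_dict, orient="columns")
from_index = pd.DataFrame.from_dict(as_index, orient="index")
```

The "records" and "split" outputs have no matching from_dict orient here, which is the gap the issue is about.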

Kind regards


femtotrader

👍4

Most helpful comment

what you are actually asking

I guess he is referring to the problem that to_dict has more modes than from_dict, which means that some to_dict conversions don't have an inverse operation.

It's an API inconsistency, and a consistent API is a "good thing". Here is an example of a problem that arises from this inconsistency:

When converting a DataFrame to JSON we use the orient="split" option, because it seems to be the only option that preserves a DataFrame entirely (order of columns and index). For the inverse operation we would need the equivalent. We cannot use to_json/read_json directly (which both support orient="split"), because the DataFrame is only a nested element in a bigger JSON structure, which requires operating on dicts. But pandas lacks the inverse operation for dicts... We tried temporarily converting the dict to a string so that we could call read_json on it with orient="split", but the double conversion makes it prohibitively slow.

bluenote10 on 14 Aug 2019

👍9

All 7 comments

you would have to show a specific example here of why this API change would actually be useful. Don't just list lots of code references.

jreback on 26 Dec 2015

The use case is to be able to save Series, DataFrame and Panel objects to MongoDB and also retrieve them.

Another feature to add to to_dict would be to output a dict of NumPy arrays instead of a dict of Python lists.

femtotrader on 26 Dec 2015
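
As a rough sketch of the "dict of NumPy arrays" idea mentioned above: the same result can be built by hand with a dict comprehension. This is only a workaround under assumed sample data, not an existing to_dict option.

```python
import numpy as np
import pandas as pd

# Illustrative frame (not from the original thread).
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

# orient="list" yields plain Python lists per column.
as_lists = df.to_dict(orient="list")

# Hand-rolled alternative: one NumPy array per column.
as_arrays = {name: df[name].to_numpy() for name in df.columns}
```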

pls show a specific example
and a use case - you are describing a very general case

jreback on 26 Dec 2015

https://bitbucket.org/djcbeach/monary/issues/19/use-pandas-series-dataframe-and-panel-with

femtotrader on 26 Dec 2015

still not sure WHY this would be a good thing to add to the API nor what you are actually asking

jreback on 26 Dec 2015

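what you are actually asking

I guess he is referring to the problem that to_dict has more modes than from_dict, which means that some to_dict conversions don't have an inverse operation.

It's an API inconsistency, and a consistent API is a "good thing". Here is an example of a problem that arises from this inconsistency:

When converting a DataFrame to JSON we use the orient="split" option, because it seems to be the only option that preserves a DataFrame entirely (order of columns and index). For the inverse operation we would need the equivalent. We cannot use to_json/read_json directly (which both support orient="split"), because the DataFrame is only a nested element in a bigger JSON structure, which requires operating on dicts. But pandas lacks the inverse operation for dicts... We tried temporarily converting the dict to a string so that we could call read_json on it with orient="split", but the double conversion makes it prohibitively slow.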

bluenote10 on 14 Aug 2019

👍9
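
One possible workaround for the nested "split" case, sketched under assumptions: the dict produced by to_dict(orient="split") carries "index", "columns" and "data" entries, which can be passed straight to the DataFrame constructor. The helper name frame_from_split and the sample document are hypothetical, and dtypes are re-inferred from the data rather than preserved.

```python
import pandas as pd


def frame_from_split(payload):
    """Hypothetical helper: rebuild a DataFrame from a to_dict(orient="split") dict."""
    return pd.DataFrame(
        payload["data"],
        index=payload["index"],
        columns=payload["columns"],
    )


# Illustrative frame with non-sorted column and index order.
df = pd.DataFrame({"b": [1.5, 2.5], "a": [1, 2]}, index=["r2", "r1"])

# The frame is only one nested element of a larger dict structure.
document = {"name": "example", "frame": df.to_dict(orient="split")}

# Rebuild the frame from the nested element without going through a JSON string.
restored = frame_from_split(document["frame"])
```

This avoids the dict-to-string-to-read_json round trip, at the cost of maintaining a small helper outside pandas.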
