Pandas: groupby fails when elements are lists

Created on 14 Apr 2017 · 1 Comment · Source: pandas-dev/pandas

```
import pandas as pd

df = pd.DataFrame({'a': [[1, 2, 3]]})
g = df.groupby('a')
g.groups

TypeError: unhashable type: 'list'
```

Groupby Usage Question

Most helpful comment

Nested lists aren't a first-class type in pandas - they can sometimes be used, but not as groupby keys, because, as the error message says, they aren't hashable.
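The hashability point can be checked in plain Python, without pandas. The following is a minimal sketch; `is_hashable` is a hypothetical helper written for illustration, not a pandas or standard-library function.

```python
def is_hashable(value):
    """Return True if Python can compute a hash for value (hypothetical helper)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


print(is_hashable([1, 2, 3]))    # False: lists are mutable and define no hash
print(is_hashable((1, 2, 3)))    # True: a tuple of hashable items is hashable
print(is_hashable(([1, 2], 3)))  # False: a tuple that contains a list is not
```

The last case suggests that converting only the outer list to a tuple may not be enough if the lists are themselves nested.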

If possible, it's better to store nested data in a MultiIndex or some other multi-dimensional structure. If you need to keep each list together as a single key, you can convert it to a tuple, which is hashable.

In [16]: df['a'] = df['a'].apply(tuple)

In [17]: df.groupby('a').groups
Out[17]: {(1, 2, 3): Int64Index([0], dtype='int64')}
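As a slightly larger illustration of the two suggestions above, here is a sketch on made-up data. The column names `value` and `a_key` and the sample rows are assumptions for the example, and using `explode` to get one row per element is only one possible reading of "some other multi-dimensional structure"; it assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical sample data: every cell in column 'a' is a Python list.
df = pd.DataFrame({
    'a': [[1, 2, 3], [4, 5, 6], [1, 2, 3]],
    'value': [10, 20, 30],
})

# Option 1: build a hashable key by converting each list to a tuple.
df['a_key'] = df['a'].apply(tuple)
totals = df.groupby('a_key')['value'].sum()
# Rows 0 and 2 share the key (1, 2, 3), so their values are summed together.
print(totals)

# Option 2 (assumption): reshape to long form, one row per list element,
# so that grouping happens on scalar values instead of on lists.
long_df = df[['a', 'value']].explode('a')
print(long_df.groupby('a')['value'].sum())
```

Keeping the tuple in a separate column leaves the original lists untouched, which may be convenient if other code still expects them.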
