Pandas: groupby agg with rank and a parameter raises "Function does not reduce"

Created on 25 Nov 2016 · 2 Comments · Source: pandas-dev/pandas

Sample with a MultiIndex in the columns:

import pandas as pd

df = pd.DataFrame({('r', 'c'): {(1, '2016-11-01 00:00:00+00:00', 3121): 143, (1, '2016-11-01 00:00:00+00:00', 4880): 12, (1, '2016-11-01 00:00:00+00:00', 3953): 4, (1, '2016-11-01 00:00:00+00:00', 3923): 11}})
df.index.names = ['z','x','y']  
print (df)
                                    r
                                    c
z x                         y        
1 2016-11-01 00:00:00+00:00 3121  143
                            3923   11
                            3953    4
                            4880   12

x = 'x'
y = 'y'

# works fine
print (df.groupby(level=[x, y]).agg({('r', 'c'): 'rank'}))

print (df.groupby(level=[x, y]).agg({('r', 'c'): lambda x: x.rank(ascending=False)}))
#ValueError: Function does not reduce

The problem also occurs with non-MultiIndex columns:

df = pd.DataFrame({'A':[1,1,3,3],
                   'B':[4,5,6,1]})

print (df)
   A  B
0  1  4
1  1  5
2  3  6
3  3  1

print (df.groupby('A').agg({'B': 'rank'}))
     B
0  1.0
1  2.0
2  2.0
3  1.0
print (df.groupby('A').agg({'B': lambda x: x.rank()}))
#Exception: Must produce aggregated value
print (df.groupby('A').agg({'B': lambda x: x.rank(ascending=False)}))
#Exception: Must produce aggregated value

The question is: how can I use a function with a parameter in agg? (See the related SO question.)
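For reference, here is a minimal sketch of three ways to pass a parameter such as ascending=False to a per-group rank without going through agg. It assumes a recent pandas release and has not been checked against the 0.19.1 install listed in this report. The helper name rank_desc is hypothetical and used only for illustration.

```python
import pandas as pd

# Same small frame as in the report.
df = pd.DataFrame({"A": [1, 1, 3, 3], "B": [4, 5, 6, 1]})
grouped = df.groupby("A")["B"]

# 1. Call the groupby rank method directly and pass the keyword to it.
direct = grouped.rank(ascending=False)

# 2. Hand transform a callable plus the keyword; transform forwards
#    extra keyword arguments to the callable for each group.
forwarded = grouped.transform(pd.Series.rank, ascending=False)

# 3. Wrap the parameter in a small named function (hypothetical helper).
def rank_desc(s):
    return s.rank(ascending=False)

wrapped = grouped.transform(rank_desc)

print(direct)
```

All three spellings should give the same per-row ranks, so the choice is mostly a matter of readability.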

print (pd.show_versions())
INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: sk_SK
LOCALE: None.None

pandas: 0.19.1
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: 0.2.1
None

Thank you for pandas and for your perfect documentation.

Apply Bug Groupby

All 2 comments

In [13]: df.groupby(level=[x,y]).rank(ascending=False)
Out[13]: 
                                    r
                                    c
z x                         y        
1 2016-11-01 00:00:00+00:00 3121  1.0
                            3923  1.0
                            3953  1.0
                            4880  1.0

In [14]: df.groupby(level=[x,y]).transform(lambda x: x.rank(ascending=False))
Out[14]: 
                                    r
                                    c
z x                         y        
1 2016-11-01 00:00:00+00:00 3121  1.0
                            3923  1.0
                            3953  1.0
                            4880  1.0

You need to use .transform; .agg is by definition a reducer.
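To illustrate the distinction, here is a small sketch on the flat frame from the report, assuming a recent pandas release. agg is given a function that returns one scalar per group, while transform is given a function that returns one value per row. The variable names medians and ranks_desc are illustrative only.

```python
import pandas as pd

# Same small frame as in the report.
df = pd.DataFrame({"A": [1, 1, 3, 3], "B": [4, 5, 6, 1]})
grouped = df.groupby("A")["B"]

# agg with a reducing function: one value per group, indexed by A.
medians = grouped.agg(lambda s: s.quantile(0.5))

# transform with a shape-preserving function: one value per original
# row, aligned with df.index.
ranks_desc = grouped.transform(lambda s: s.rank(ascending=False))

print(medians)
print(ranks_desc)
```

The first result has one row per group and the second has one row per input row, which is why a rank lambda fits transform but not agg.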

This is a duplicate of #11759.

I think there is a bug somewhere in there. If you want to dig into it, that would be appreciated.

Actually, I'm going to reopen this one. I think the other issue is fixed; this one is slightly different.
