Pandas: BUG: values are dropped when grouping on a variable with NaNs

Created on 2 Dec 2020 · 3 Comments · Source: pandas-dev/pandas

Hello there,

Consider this simple example:

import numpy as np
import pandas as pd

pd.__version__
Out[158]: '1.1.3'

df = pd.DataFrame({'col': [1, 2, 3],
                   'group': ['a', np.nan, 'b']})

df
Out[160]: 
   col group
0    1     a
1    2   NaN
2    3     b

df.groupby('group').apply(lambda x: x)
Out[161]: 
   col group
0  1.0     a
1  NaN   NaN
2  3.0     b

There are a few things that are puzzling.

  1. I thought groupby would drop the NA groups by default
  2. Why is the variable col set to missing?

This looks like a bug to me, unless I am missing something (apologies if this is the case)
What do you think?
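
To make point 1 concrete, here is a minimal sketch of what I expected, assuming pandas 1.1 or later, where `groupby` accepts a `dropna` argument that defaults to `True`. It only counts rows per group and does not go through `apply`, so it illustrates the expectation without reproducing the puzzling output above. The variable names `default_sizes` and `with_na_sizes` are just for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [1, 2, 3],
                   'group': ['a', np.nan, 'b']})

# Default (dropna=True): the row whose key is NaN is not assigned to any group.
default_sizes = df.groupby('group').size()

# dropna=False keeps NaN as a group key of its own.
with_na_sizes = df.groupby('group', dropna=False).size()

print(default_sizes)
print(with_na_sizes)
```

With the default, I would expect only the groups `a` and `b` to show up, and I would expect the same to hold for the result of `apply`.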

Thanks!

Labels: Apply, Bug, Groupby

All 3 comments

Please try this on master.

1.1.4 ?

Persists on master
