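import numpy as np
import pandas as pd

# Mock data: column C has gaps to forward-fill within each (A, B) group.
data = [[1, 2, 2],
        [1, 2, np.nan],
        [1, 2, np.nan],
        [3, 4, 4],
        [3, 4, np.nan],
        [3, 0, np.nan]]
data = pd.DataFrame(data, columns=["A", "B", "C"])
# Forward-fill within each group; the result no longer contains A and B.
data.groupby(["A", "B"]).ffill().reset_index()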
Forward fill with groupby drops the group keys: they appear neither in the index nor in the columns of the result.
mock data:
A | B | C
-- | -- | --
1 | 2 | 2.0
1 | 2 | NaN
1 | 2 | NaN
3 | 4 | 4.0
3 | 4 | NaN
3 | 0 | NaN
actual output:
index | C
-- | --
0 | 2.0
1 | 2.0
2 | 2.0
3 | 4.0
4 | 4.0
5 | NaN
expected output:
index | A | B | C
-- | -- | -- | --
0 | 1 | 2 | 2.0
1 | 1 | 2 | 2.0
2 | 1 | 2 | 2.0
3 | 3 | 4 | 4.0
4 | 3 | 4 | 4.0
5 | 3 | 0 | NaN
Note: data.groupby(["A", "B"]).fillna(method='ffill').reset_index() in pandas 0.24.2 has the same behaviour as in pandas 0.25.3 (the keys are missing).
Output of pd.show_versions():
commit : None
python : 3.7.3.final.0
python-bits : 64
OS : Linux
OS-release : 5.0.0-36-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.3
numpy : 1.17.3
pytz : 2018.9
dateutil : 2.8.0
pip : 19.0.3
setuptools : 40.8.0
Cython : 0.29.6
pytest : 4.3.1
hypothesis : None
sphinx : 1.8.5
blosc : None
feather : 0.4.0
xlsxwriter : 1.1.5
lxml.etree : 4.3.2
html5lib : 1.0.1
pymysql : 0.9.3
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.10
IPython : 7.4.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.3.2
matplotlib : 3.0.3
numexpr : 2.6.9
odfpy : None
openpyxl : 2.6.1
pandas_gbq : None
pyarrow : 0.15.1
pytables : None
s3fs : None
scipy : 1.2.1
sqlalchemy : 1.3.1
tables : 3.5.1
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.5
Thanks for the report, but this is by design. See https://pandas.pydata.org/pandas-docs/stable/whatsnew/v0.25.0.html#dataframe-groupby-ffill-bfill-no-longer-return-group-labels for more info.
What is the alternative to ffill inside groupby that still returns the group labels?
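One possible way to keep the group labels, offered as a sketch rather than an officially recommended pattern: select the value column from the groupby, forward-fill it, and assign the result back onto a copy of the original frame. This relies on the filled result keeping the original row index, so it lines up with the frame it is written back to. The name `filled` is illustrative and not part of the original report.

```python
import numpy as np
import pandas as pd

# Same mock data as in the report.
data = pd.DataFrame(
    [[1, 2, 2], [1, 2, np.nan], [1, 2, np.nan],
     [3, 4, 4], [3, 4, np.nan], [3, 0, np.nan]],
    columns=["A", "B", "C"],
)

# Forward-fill only the value column within each (A, B) group, then write it
# back onto a copy of the frame so the key columns A and B are preserved.
filled = data.copy()  # `filled` is a hypothetical name used for illustration
filled["C"] = data.groupby(["A", "B"])["C"].ffill()

print(filled)
```

With several value columns, the same idea should apply by selecting a list of columns from the groupby and assigning the filled frame back to those columns.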