Pandas resample without dropping nan values

Time:09-26

I have a dataset of daily measurements with a datetime index. I am trying to resample it to monthly frequency by taking only the data from the first available day of each month.

dataframe df:

2010-10-04 NaN 4
2010-10-05 3   5
2010-10-06 5   2

I tried using

df.resample("MS").first()

But that ends up giving me

2010-10-01 3   4

instead of

2010-10-04 NaN   4

Can I avoid dropping NaN values? I couldn't find a suitable parameter in the documentation.

CodePudding user response:

IIUC, you need the first row for each month in your dataset. I elaborated on your example by adding more months from different years.

Try grouping by month and taking head(1) of each group, assuming the data is sorted by date.

#               A    B
# 2010-10-04  NaN  4.0  <----
# 2010-10-05  3.0  5.0
# 2010-10-06  5.0  2.0
# 2010-09-05  NaN  NaN  <----
# 2010-09-05  3.0  5.0
# 2010-09-06  5.0  2.0
# 2019-10-04  7.0  7.0  <----
# 2019-10-05  3.0  5.0
# 2019-10-06  5.0  2.0

# On pandas 2.2 and later, the "M" alias is deprecated; use freq="ME" instead.
df.groupby(pd.Grouper(freq="M")).head(1)
              A    B
2010-09-05  NaN  NaN
2010-10-04  NaN  4.0
2019-10-04  7.0  7.0
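
For anyone who wants to reproduce this end to end, here is a minimal sketch. The sample values and the variable names (`skipped`, `first_rows`) are made up for illustration, and it groups by `df.index.to_period("M")` rather than `pd.Grouper`, which is a swapped-in variant of the same idea and not the exact call above. It also shows why the original attempt loses the NaN: `resample(...).first()` takes the first non-null value in each column separately, while `head(1)` keeps the first row as a whole.

```python
import numpy as np
import pandas as pd

# Illustrative data only: two months, each starting with a NaN in column A.
idx = pd.to_datetime(
    ["2010-10-04", "2010-10-05", "2010-10-06", "2010-11-02", "2010-11-03"]
)
df = pd.DataFrame(
    {"A": [np.nan, 3.0, 5.0, np.nan, 1.0], "B": [4.0, 5.0, 2.0, 6.0, 7.0]},
    index=idx,
)

# first() picks the first non-null value per column, so the NaN in A is
# skipped and the index is relabelled to the month start.
skipped = df.resample("MS").first()

# Sort by date, group rows by calendar month, and keep the whole first row
# of each group. NaN values and the original dates are preserved.
df = df.sort_index()
first_rows = df.groupby(df.index.to_period("M")).head(1)

print(skipped)
print(first_rows)
```

Sorting first matters because head(1) returns whichever row appears first in each group, not the row with the earliest date.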