pandas NaN after resample

Time:05-24

I have this innocent-looking data:

datetime,power
2022-02-14 15:09:58.163,114.07
2022-02-14 15:09:58.657,113.63
2022-02-14 15:10:32.237,114.28
2022-02-14 15:10:32.730,113.75
2022-02-14 15:10:33.195,113.76
2022-02-14 15:10:33.680,113.83
2022-02-14 15:10:34.195,114.44
2022-02-14 15:10:34.679,115.09

The following code produces NaNs:

import pandas as pd

_df = pd.read_csv('measurements/nan_df.csv')
_df['datetime'] = pd.to_datetime(_df['datetime'])
_df.set_index('datetime', inplace=True)
_df.resample('s').mean()
datetime,power
2022-02-14 15:09:58,113.85
2022-02-14 15:09:59,
2022-02-14 15:10:00,
2022-02-14 15:10:01,
2022-02-14 15:10:02,
2022-02-14 15:10:03,
2022-02-14 15:10:04,
2022-02-14 15:10:05,
2022-02-14 15:10:06,
2022-02-14 15:10:07,
2022-02-14 15:10:08,
2022-02-14 15:10:09,

Any idea why?

CodePudding user response:

This is expected: by default, DataFrame.resample builds a continuous DatetimeIndex at the requested frequency, so every one-second bin that contains no rows gets NaN as its mean.
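To see where the empty bins come from, here is a minimal sketch that inlines the sample rows from the question instead of reading `measurements/nan_df.csv`. The names `csv_text`, `resampled`, `total_bins` and `empty_bins` are only illustrative and do not appear in the original post.

```python
import io

import pandas as pd

# Sample rows copied from the question, inlined so the sketch runs without the CSV file.
csv_text = """datetime,power
2022-02-14 15:09:58.163,114.07
2022-02-14 15:09:58.657,113.63
2022-02-14 15:10:32.237,114.28
2022-02-14 15:10:32.730,113.75
2022-02-14 15:10:33.195,113.76
2022-02-14 15:10:33.680,113.83
2022-02-14 15:10:34.195,114.44
2022-02-14 15:10:34.679,115.09
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["datetime"], index_col="datetime")

resampled = df.resample("s").mean()

# resample creates one bin per second between the first and last timestamp.
total_bins = len(resampled)
# Bins with no measurements in them have NaN as their mean.
empty_bins = int(resampled["power"].isna().sum())

print(total_bins, empty_bins)
```

With this data there is one bin for each second from 15:09:58 through 15:10:34, and only the four seconds that actually contain measurements have a value.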

If you need to remove the rows with missing values, use:

df = df.resample('s').mean().dropna()
print(df)
                       power
datetime                    
2022-02-14 15:09:58  113.850
2022-02-14 15:10:32  114.015
2022-02-14 15:10:33  113.795
2022-02-14 15:10:34  114.765
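
As a possible alternative (not part of the answer above), you could group by the timestamp truncated to the second, which never creates the empty bins in the first place. This is only a sketch under the assumption that the per-second means of the non-empty seconds are all you want. The sample rows are inlined from the question, and `csv_text`, `per_second` and `expected` are illustrative names.

```python
import io

import pandas as pd

# Sample rows copied from the question, inlined so the sketch runs without the CSV file.
csv_text = """datetime,power
2022-02-14 15:09:58.163,114.07
2022-02-14 15:09:58.657,113.63
2022-02-14 15:10:32.237,114.28
2022-02-14 15:10:32.730,113.75
2022-02-14 15:10:33.195,113.76
2022-02-14 15:10:33.680,113.83
2022-02-14 15:10:34.195,114.44
2022-02-14 15:10:34.679,115.09
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["datetime"], index_col="datetime")

# Group rows by their timestamp floored to the second; only seconds that
# occur in the data become groups, so there are no empty bins to drop.
per_second = df.groupby(df.index.floor("s")).mean()

# The approach from the answer, for comparison.
expected = df.resample("s").mean().dropna()

print(per_second)
```

For this data both approaches should give the same four rows. The difference is that resample first builds every intermediate second and dropna then discards the empty ones, which matters mostly when the gaps between measurements are long.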