I have this innocent-looking data:
datetime,power
2022-02-14 15:09:58.163,114.07
2022-02-14 15:09:58.657,113.63
2022-02-14 15:10:32.237,114.28
2022-02-14 15:10:32.730,113.75
2022-02-14 15:10:33.195,113.76
2022-02-14 15:10:33.680,113.83
2022-02-14 15:10:34.195,114.44
2022-02-14 15:10:34.679,115.09
The following code produces NaNs:
import pandas as pd

_df = pd.read_csv('measurements/nan_df.csv')
_df['datetime'] = pd.to_datetime(_df['datetime'])
_df.set_index('datetime', inplace=True)
_df.resample('s').mean()
datetime,power
2022-02-14 15:09:58,113.85
2022-02-14 15:09:59,
2022-02-14 15:10:00,
2022-02-14 15:10:01,
2022-02-14 15:10:02,
2022-02-14 15:10:03,
2022-02-14 15:10:04,
2022-02-14 15:10:05,
2022-02-14 15:10:06,
2022-02-14 15:10:07,
2022-02-14 15:10:08,
2022-02-14 15:10:09,
Any idea why?
CodePudding user response:
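This is expected. DataFrame.resample builds a continuous DatetimeIndex at the requested frequency, running from the first timestamp to the last, so every one-second bin in that range appears in the result. Bins that contain no rows have nothing to average, so their mean is NaN. Your data jumps from 15:09:58 to 15:10:32, so all the seconds in between come out as NaN.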
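To see this for yourself, here is a minimal sketch that counts the rows in each one-second bin. It reads the sample data from the question out of an inline string instead of the CSV file; the names `csv_text`, `counts` and `means` are just placeholders I picked for the illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question, inlined so the snippet runs on its own.
csv_text = """datetime,power
2022-02-14 15:09:58.163,114.07
2022-02-14 15:09:58.657,113.63
2022-02-14 15:10:32.237,114.28
2022-02-14 15:10:32.730,113.75
2022-02-14 15:10:33.195,113.76
2022-02-14 15:10:33.680,113.83
2022-02-14 15:10:34.195,114.44
2022-02-14 15:10:34.679,115.09
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["datetime"], index_col="datetime")

# One bin per second between the first and the last timestamp.
resampler = df["power"].resample("s")
counts = resampler.count()  # number of rows that fall into each bin
means = resampler.mean()    # NaN wherever the bin is empty

print(len(means))                # total number of one-second bins
print(int((counts == 0).sum()))  # bins without any measurement
print(int(means.isna().sum()))   # same number: every empty bin has a NaN mean
```

The bins run from 15:09:58 through 15:10:34, and only four of them contain any rows, so all the others show up as NaN.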
If you need to remove the rows with missing values, use:
df = df.resample('s').mean().dropna()
print(df)
                       power
datetime
2022-02-14 15:09:58  113.850
2022-02-14 15:10:32  114.015
2022-02-14 15:10:33  113.795
2022-02-14 15:10:34  114.765
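As an alternative (not part of the original approach, just a sketch), you could avoid creating the empty bins in the first place by flooring the index to whole seconds and grouping on that. On this sample it should give the same rows as resample followed by dropna. The names `csv_text`, `via_resample` and `via_groupby` are placeholders for the illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question, inlined so the snippet runs on its own.
csv_text = """datetime,power
2022-02-14 15:09:58.163,114.07
2022-02-14 15:09:58.657,113.63
2022-02-14 15:10:32.237,114.28
2022-02-14 15:10:32.730,113.75
2022-02-14 15:10:33.195,113.76
2022-02-14 15:10:33.680,113.83
2022-02-14 15:10:34.195,114.44
2022-02-14 15:10:34.679,115.09
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["datetime"], index_col="datetime")

# Approach from the answer: build every one-second bin, then drop the empty ones.
via_resample = df.resample("s").mean().dropna()

# Alternative: truncate each timestamp to its second and group on that,
# so only seconds that actually occur in the data become groups.
via_groupby = df.groupby(df.index.floor("s")).mean()

print(via_groupby)
```

The groupby version never materialises the empty seconds, which may matter if the gaps in your data are long. Note that resample followed by dropna would also drop bins whose input values were all NaN, so the two are only interchangeable when the gaps come purely from missing rows.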