I have a pandas dataframe which looks like this:
Date and Time Seconds Pressure (mmHg) Temperature (C)
0 2021-05-13 13:00:00 0.000 709.719 26.551
1 2021-05-13 14:00:00 3600.001 709.364 25.966
2 2021-05-13 15:00:00 7200.001 708.698 25.331
3 2021-05-13 16:00:00 10800.001 707.689 25.184
4 2021-05-13 17:00:00 14400.001 707.206 25.184
The pressure and temperature data meant to be in 15 minutes intervals but the sensor setting was wrong and collected data ever hour. Assuming linear interpolation, how to expand data timestamp to 15-minutes intervals and fill missing data between hours with liner interpolations? I tried the solution suggested here, but my files are huge and there are many of them . This solution is not very fast.
CodePudding user response:
Use DataFrame.resample
with Resampler.first
for missing values between hours and then DataFrame.interpolate
:
df['Date and Time'] = pd.to_datetime(df['Date and Time'])
df = (df.resample('15Min', on='Date and Time')[['Pressure (mmHg)','Temperature (C)']]
.first()
.interpolate())
print (df)
Pressure (mmHg) Temperature (C)
Date and Time
2021-05-13 13:00:00 709.71900 26.55100
2021-05-13 13:15:00 709.63025 26.40475
2021-05-13 13:30:00 709.54150 26.25850
2021-05-13 13:45:00 709.45275 26.11225
2021-05-13 14:00:00 709.36400 25.96600
2021-05-13 14:15:00 709.19750 25.80725
2021-05-13 14:30:00 709.03100 25.64850
2021-05-13 14:45:00 708.86450 25.48975
2021-05-13 15:00:00 708.69800 25.33100
2021-05-13 15:15:00 708.44575 25.29425
2021-05-13 15:30:00 708.19350 25.25750
2021-05-13 15:45:00 707.94125 25.22075
2021-05-13 16:00:00 707.68900 25.18400
2021-05-13 16:15:00 707.56825 25.18400
2021-05-13 16:30:00 707.44750 25.18400
2021-05-13 16:45:00 707.32675 25.18400
2021-05-13 17:00:00 707.20600 25.18400