Home > Software design >  Filling missing data by interpolation in Python
Filling missing data by interpolation in Python

Time:09-29

I have a pandas dataframe which looks like this:

           Date and Time      Seconds  Pressure (mmHg)  Temperature (C)
0    2021-05-13 13:00:00        0.000          709.719           26.551
1    2021-05-13 14:00:00     3600.001          709.364           25.966
2    2021-05-13 15:00:00     7200.001          708.698           25.331
3    2021-05-13 16:00:00    10800.001          707.689           25.184
4    2021-05-13 17:00:00    14400.001          707.206           25.184

The pressure and temperature data meant to be in 15 minutes intervals but the sensor setting was wrong and collected data ever hour. Assuming linear interpolation, how to expand data timestamp to 15-minutes intervals and fill missing data between hours with liner interpolations? I tried the solution suggested here, but my files are huge and there are many of them . This solution is not very fast.

CodePudding user response:

Use DataFrame.resample with Resampler.first for missing values between hours and then DataFrame.interpolate:

df['Date and Time'] = pd.to_datetime(df['Date and Time'])

df = (df.resample('15Min', on='Date and Time')[['Pressure (mmHg)','Temperature (C)']]
       .first()
       .interpolate())
print (df)

                     Pressure (mmHg)  Temperature (C)
Date and Time                                        
2021-05-13 13:00:00        709.71900         26.55100
2021-05-13 13:15:00        709.63025         26.40475
2021-05-13 13:30:00        709.54150         26.25850
2021-05-13 13:45:00        709.45275         26.11225
2021-05-13 14:00:00        709.36400         25.96600
2021-05-13 14:15:00        709.19750         25.80725
2021-05-13 14:30:00        709.03100         25.64850
2021-05-13 14:45:00        708.86450         25.48975
2021-05-13 15:00:00        708.69800         25.33100
2021-05-13 15:15:00        708.44575         25.29425
2021-05-13 15:30:00        708.19350         25.25750
2021-05-13 15:45:00        707.94125         25.22075
2021-05-13 16:00:00        707.68900         25.18400
2021-05-13 16:15:00        707.56825         25.18400
2021-05-13 16:30:00        707.44750         25.18400
2021-05-13 16:45:00        707.32675         25.18400
2021-05-13 17:00:00        707.20600         25.18400
  • Related