import numpy as np
import pandas as pd
df = pd.DataFrame({'date': ['3/10/2000', '3/11/2000', '3/12/2000', '3/13/2000', '3/14/2000', '3/15/2000', '3/16/2000', '3/17/2000', '3/18/2000'],
                   'value': [2, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 25]})
I want to replace the NaN values in this dataframe using the following logic. The two non-NaN values are 8 days apart (3/18/2000 - 3/10/2000 = 8 days), and the difference between them is delta = 25 - 2 = 23. Each NaN value should be replaced with 2 + delta * (t / 8), where t is the number of days elapsed since 3/10/2000 for a day that falls between the two non-NaN values. My desired outcome for the value column is: [2, 4.875, 7.75, 10.625, 13.5, 16.375, 19.25, 22.125, 25]
CodePudding user response:
You can convert the date column to datetime, set it as the index, and interpolate with the 'index' method:
df['value'] = (df
    .assign(date=pd.to_datetime(df['date']))
    .set_index('date')['value']
    .interpolate('index')
    .values
)
output:
date value
0 3/10/2000 2.000
1 3/11/2000 4.875
2 3/12/2000 7.750
3 3/13/2000 10.625
4 3/14/2000 13.500
5 3/15/2000 16.375
6 3/16/2000 19.250
7 3/17/2000 22.125
8 3/18/2000 25.000
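As a sanity check, the formula from the question, 2 + delta * (t / 8), can be evaluated by hand and compared with the interpolated column. This is only a sketch: the names `dates`, `known`, `t`, `manual`, and `via_pandas` are introduced here for illustration, and it assumes the dates are in month/day/year order.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'date': ['3/10/2000', '3/11/2000', '3/12/2000', '3/13/2000', '3/14/2000',
             '3/15/2000', '3/16/2000', '3/17/2000', '3/18/2000'],
    'value': [2, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 25],
})

# Assumption: dates are month/day/year.
dates = pd.to_datetime(df['date'], format='%m/%d/%Y')
known = df['value'].notna()

# t = number of days elapsed since the first known value.
t = (dates - dates[known].iloc[0]).dt.days.to_numpy()

# np.interp performs piecewise-linear interpolation between the known points,
# which here amounts to start + delta * (t / span).
manual = np.interp(t, t[known.to_numpy()], df.loc[known, 'value'].to_numpy())

# Same result through the datetime index, as in the answer above.
via_pandas = (df.assign(date=dates)
                .set_index('date')['value']
                .interpolate('index')
                .to_numpy())

print(np.allclose(manual, via_pandas))
```

Because the 'index' method uses the actual index values, the same approach should also hold when the dates are not evenly spaced, whereas the default 'linear' method treats the rows as equally spaced.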