My pandas DataFrame looks like this:
DOY Value
0 5 5118
1 10 5098
2 15 5153
I've been trying to resample my data and fill in the gaps using pandas' resample function. My worry is that, since my data has a day-of-year (DOY) column rather than actual datetime values, resample won't work on it.
I tried the following line of code, but got an error saying that I was using a RangeIndex. Perhaps I need to use a PeriodIndex somehow, but I'm not sure how to go about it.
inter.resample('1D').mean().interpolate()
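For reference, the error can be reproduced with a minimal sketch like the one below. The constructor call is an assumption (the original code that builds inter is not shown); only the three rows above are taken from the question.

```python
import pandas as pd

# Assumed reconstruction of the frame from the three rows shown above.
inter = pd.DataFrame({"DOY": [5, 10, 15], "Value": [5118, 5098, 5153]})

try:
    # resample needs a datetime-like index; the default RangeIndex is rejected.
    inter.resample("1D").mean()
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The message names the index type that was found, which is why RangeIndex shows up in the error.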
Here's my intended result:
DOY Value
0 5 5118
1 6 5114
2 7 5110
3 8 5106
4 9 5102
5 10 5098
: : :
10 15 5153
CodePudding user response:
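Convert DOY to datetime with pd.to_datetime, use it as the index for the resample, and then discard the helper date index:
# "%j" parses day-of-year; the year defaults to 1900 and is only a placeholder
df["date"] = pd.to_datetime(df["DOY"].astype(str), format="%j")
# setting the index first avoids relying on whether resample(on=...) keeps the "date" column
output = df.set_index("date").resample("D").last().interpolate().reset_index(drop=True)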
>>> output
DOY Value
0 5.0 5118.0
1 6.0 5114.0
2 7.0 5110.0
3 8.0 5106.0
4 9.0 5102.0
5 10.0 5098.0
6 11.0 5109.0
7 12.0 5120.0
8 13.0 5131.0
9 14.0 5142.0
10 15.0 5153.0
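A self-contained sketch of the same datetime-based idea, assuming the sample data from the question. The names dates, daily and result are illustrative only. It rebuilds DOY from the daily index so that the column stays an integer instead of being interpolated to floats.

```python
import pandas as pd

# Sample data from the question (the constructor call is an assumption).
df = pd.DataFrame({"DOY": [5, 10, 15], "Value": [5118, 5098, 5153]})

# Map day-of-year to a date; the year defaults to 1900 and is only a placeholder.
dates = pd.to_datetime(df["DOY"].astype(str), format="%j")

# Resample Value to daily frequency and fill the gaps linearly.
daily = (
    df.set_index(dates)["Value"]
      .resample("D")
      .mean()
      .interpolate()
)

# Rebuild DOY from the daily index so it stays an integer column.
result = pd.DataFrame({"DOY": daily.index.dayofyear, "Value": daily.to_numpy()})
print(result)
```

Because the year is arbitrary, this only makes sense when all DOY values belong to a single year.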
CodePudding user response:
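pd.DataFrame.interpolate fills NaN values row by row along the index (with the default method='linear' it treats the rows as equally spaced). So let's start by setting DOY as the index, then build a new index covering every day in the range and interpolate over it.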
d0 = df.set_index('DOY')
idx = pd.RangeIndex(d0.index.min(), d0.index.max() + 1, name='DOY')
d0.reindex(idx).interpolate().reset_index()
DOY Value
0 5 5118.0
1 6 5114.0
2 7 5110.0
3 8 5106.0
4 9 5102.0
5 10 5098.0
6 11 5109.0
7 12 5120.0
8 13 5131.0
9 14 5142.0
10 15 5153.0
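As an alternative that skips the index handling, the same linear fill can be sketched with numpy.interp. This is a swapped-in technique, not part of the answer above. It assumes the sample data from the question and that DOY is sorted in increasing order, which numpy.interp requires. The names full_doy and result are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question (the constructor call is an assumption).
df = pd.DataFrame({"DOY": [5, 10, 15], "Value": [5118, 5098, 5153]})

# Every day from the first to the last observed DOY, inclusive.
full_doy = np.arange(df["DOY"].min(), df["DOY"].max() + 1)

# Linear interpolation of Value at each day; DOY must be increasing.
result = pd.DataFrame({
    "DOY": full_doy,
    "Value": np.interp(full_doy, df["DOY"], df["Value"]),
})
print(result)
```

Since numpy.interp interpolates against the actual DOY values, it also handles unevenly spaced observations without a reindex step.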