How to use pandas resample using 'day of year' data (Python)

My pandas DataFrame looks like this:

     DOY Value
0      5  5118
1     10  5098
2     15  5153

I've been trying to resample my data and fill in the gaps using the pandas resample function. My worry is that, since I don't have actual datetime values, I won't be able to resample my data.

My attempt to solve this was the following line of code, but it raised an error saying that I was using a RangeIndex. Perhaps I need to use a PeriodIndex somehow, but I'm not sure how to go about it.

inter.resample('1D').mean().interpolate()
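
For reference, a minimal reproduction of that error, assuming the frame is named inter as above and holds only the three rows shown. resample needs a datetime-like index (DatetimeIndex, TimedeltaIndex or PeriodIndex), and a freshly built DataFrame has a plain RangeIndex:

```python
import pandas as pd

# Sample data copied from the question; "inter" is the name used there
inter = pd.DataFrame({"DOY": [5, 10, 15], "Value": [5118, 5098, 5153]})

error_message = None
try:
    # Fails: the default index is a RangeIndex, not a datetime-like index
    inter.resample("1D").mean().interpolate()
except TypeError as exc:
    error_message = str(exc)

print(type(inter.index).__name__)
print(error_message)
```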

Here's my intended result

     DOY Value
0      5  5118
1      6  5114
2      7  5110
3      8  5106
4      9  5102
5     10  5098
:      :    :
10    15  5153

CodePudding user response：

Convert DOY to datetimes with pd.to_datetime, perform the resample on those dates and then drop the helper dates again:

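# "%j" parses the day of the year; with no year given, the dates fall in 1900
df["date"] = pd.to_datetime(df["DOY"].astype(str), format="%j")
# Use the dates as the index, resample to daily rows, fill the gaps linearly
output = df.set_index("date").resample("D").last().interpolate().reset_index(drop=True)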

>>> output
     DOY   Value
0    5.0  5118.0
1    6.0  5114.0
2    7.0  5110.0
3    8.0  5106.0
4    9.0  5102.0
5   10.0  5098.0
6   11.0  5109.0
7   12.0  5120.0
8   13.0  5131.0
9   14.0  5142.0
10  15.0  5153.0
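
If you'd rather keep DOY as integers, here is a self-contained sketch of the same idea. It assumes the three sample rows from the question and that every DOY belongs to the same non-leap year (with format "%j" and no year, the dates are parsed into 1900). The names dates and daily are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({"DOY": [5, 10, 15], "Value": [5118, 5098, 5153]})

# Parse day-of-year into real dates so that resample has a DatetimeIndex
dates = pd.to_datetime(df["DOY"].astype(str), format="%j")

daily = (
    df.set_index(dates)["Value"]
    .resample("D")
    .mean()          # one row per calendar day, NaN where there was no sample
    .interpolate()   # linear fill between the known days
)

# Rebuild DOY from the dates instead of interpolating it, so it stays integer
output = pd.DataFrame({
    "DOY": daily.index.dayofyear.to_numpy(),
    "Value": daily.to_numpy(),
})
print(output)
```

Here DOY is read back from the date index with dayofyear, so only Value goes through the interpolation.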

CodePudding user response：

pd.DataFrame.interpolate fills in NaN rows along the index. So let's start by setting DOY as the index and then build a new, complete index over which we will interpolate.

d0 = df.set_index('DOY')
idx = pd.RangeIndex(d0.index.min(), d0.index.max() + 1, name='DOY')

d0.reindex(idx).interpolate().reset_index()

      DOY   Value
0       5  5118.0
1       6  5114.0
2       7  5110.0
3       8  5106.0
4       9  5102.0
5      10  5098.0
6      11  5109.0
7      12  5120.0
8      13  5131.0
9      14  5142.0
10     15  5153.0
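
The same steps as a self-contained sketch you can paste and run, assuming the three sample rows from the question. The final cast back to int is an optional extra that only makes sense because the interpolated values happen to be whole numbers for this data:

```python
import pandas as pd

df = pd.DataFrame({"DOY": [5, 10, 15], "Value": [5118, 5098, 5153]})

d0 = df.set_index("DOY")

# RangeIndex excludes its stop value, hence the + 1 to include the last day
idx = pd.RangeIndex(int(d0.index.min()), int(d0.index.max()) + 1, name="DOY")

# reindex inserts NaN rows for the missing days, interpolate fills them linearly
result = d0.reindex(idx).interpolate().reset_index()

# Optional: restore integer values (assumes whole-number results, as here)
result["Value"] = result["Value"].round().astype(int)
print(result)
```

No datetime conversion is involved, so this also works when the DOY values do not map cleanly onto a single calendar year.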