Home > Software engineering >  Python: Rolling Minimum by date interval
Python: Rolling Minimum by date interval

Time:03-24

thanks for taking the time to read this question.

I am using time series data which is reported weekly. I am trying to calculate the minimum value of each row over 3 years which I have done using the code below. Since the data is reported weekly for each row it would be the minimum value of 156 rows (3yrs before). The column Spec_Min details the minimum value for each row over 3 years.

However, halfway through my data, it begins to be reported twice a month but I still need to have the minimum values over 3 years therefore no longer 156 rows later. I was wondering if there was a more simple way of doing this?

Perhaps doing it via date rather than rows but I am not sure how to do that.


df1['Spec_Min']=df1['Spec_NET'].rolling(156).min()
df1

Date          Spec_NET   Hed_NET   Spec_Min
1995-10-31       9.0     -13.5     -49.7
1995-11-07      11.9     -23.5     -49.7
1995-11-14       9.8     -19.4     -49.7
1995-11-21       9.7     -25.4     -49.7
1995-11-28      10.4     -20.3     -49.7
1995-12-05       1.6     -15.3     -49.7
1995-12-12     -17.0      14.2     -49.7
1995-12-19     -16.6      15.2     -49.7
1995-12-26       4.7     -15.2     -49.7
1996-01-02       5.3     -22.7     -49.7
1996-01-16       7.3     -21.0     -49.7
1996-01-23       1.3     -20.4     -49.7

CodePudding user response:

Pandas allows you to operate with a datetime aware rolling window. You'll need to structure your code to operate in terms of the number of days (365 * 3 for 3 years).

I used your provided sample DataFrame

df['Spec_Min'] = df.rolling(f'{365 * 3}D', on='Date')['Spec_NET'].min()

print(df)
         Date  Spec_NET  Hed_NET  Spec_Min
0  1995-10-31       9.0    -13.5       9.0
1  1995-11-07      11.9    -23.5       9.0
2  1995-11-14       9.8    -19.4       9.0
3  1995-11-21       9.7    -25.4       9.0
4  1995-11-28      10.4    -20.3       9.0
5  1995-12-05       1.6    -15.3       1.6
6  1995-12-12     -17.0     14.2     -17.0
7  1995-12-19     -16.6     15.2     -17.0
8  1995-12-26       4.7    -15.2     -17.0
9  1996-01-02       5.3    -22.7     -17.0
10 1996-01-16       7.3    -21.0     -17.0
11 1996-01-23       1.3    -20.4     -17.0

CodePudding user response:

Use pd.to_datetime to convert Date column to Datetime, then you could use on parameter:

df['Spec_Min'] = df.assign(Date=pd.to_datetime(df['Date'])\
                   .rolling(f'{365 * 3}D', on='Date')['Spec_NET'].min()

CodePudding user response:

Try something like this: (if your index is already a datetimeindex, skip the first two rows)

df.set_index('Date',inplace = True,drop = True)
df.index = pd.to_datetime(df.index)

# resample your dataframe in weekly frequency, and interpolate missing values 
conformed = df.resample('W-MON').mean().interpolate(method = 'nearest')

n_weeks = 3 # the length of the rolling window (in weeks)
result = conformed.rolling(n_weeks).min()

Note that, you mention that you want the minimum of each row. But it seems like you are calculating the rolling minimum of each column...

  • Related