Getting the average of the previous "x" amount of days into the current position of the ne-CodePudding

I need help with getting the average of the previous X amount of days into the current position of the new column.

The problem I am having is at the line of code df['avg'] = (df['Close'].shift(0) df['Close'].shift(1)) / 2.

This is what I want, but of course, I want it to be dynamic. That is where I need help! I can't figure out how to do so because I am having issues with how it already seems to by looping itself when called.

I understand what it is doing and why (...I think) but can't figure out a way around it to get my desired result.

import pandas as pd
import os
import sys
import NasdaqTickerSymbols as nts    
class MY_PANDA_INDICATORS():
    
    def __init__(self, days, csvFile):
        self.days = days
        self.df = None
        self.csvFile = csvFile


    def GetDataFrame(self):
        modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
        datapath = os.path.join(modpath, "CSV\\"  self.csvFile   ".csv")
        df = pd.read_csv(datapath)
        return(df)    

    def GetEMA(self):
        df['avg'] = df['Close'].shift(0)   df['Close'].shift(1) 
        return(df)
    

myD = MY_PANDA_INDICATORS(2,nts.matches[0])
print(myD.GetEMA())

Here is what I am getting and also what I want, but I want to be able to change the number of days and get the average of that "x" amount I pass to it. I have tried looping but none work as intended.

           Date  Open  High   Low  Close  Adj Close  Volume    avg
0    2020-11-16  1.15  1.15  1.11   1.12       1.12   17100    NaN
1    2020-11-17  1.15  1.15  1.11   1.13       1.13   29900  1.125
2    2020-11-18  1.15  1.20  1.12   1.16       1.16  127700  1.145
3    2020-11-19  1.17  1.22  1.16   1.16       1.16   64500  1.160
4    2020-11-20  1.18  1.18  1.14   1.15       1.15   32600  1.155
..          ...   ...   ...   ...    ...        ...     ...    ...
246  2021-11-08  2.40  2.40  2.31   2.32       2.32   20000  2.340
247  2021-11-09  2.35  2.35  2.28   2.31       2.31   19700  2.315
248  2021-11-10  2.29  2.31  2.20   2.20       2.20   24200  2.255
249  2021-11-11  2.20  2.22  2.18   2.21       2.21   18700  2.205
250  2021-11-12  2.21  2.22  2.18   2.21       2.21    7800  2.210

CodePudding user response：

You can reindex your DataFrame by the date, and then perform a rolling mean and with the argument x number of days as a string (such as "2D"):

df['avg'] = df.set_index(["Date"]).rolling(f"{self.days}D").mean().values

On a smaller example:

df = pd.DataFrame({'date': pd.date_range('2021-01-01','2021-01-05'), 'close':[1,3,5,7,9]})

Input:

>>> df
        date  close
0 2021-01-01      1
1 2021-01-02      3
2 2021-01-03      5
3 2021-01-04      7
4 2021-01-05      9

df['avg'] = df.set_index(["date"]).rolling("2D").mean().values

Output:

>>> df
        date  close  avg
0 2021-01-01      1  1.0
1 2021-01-02      3  2.0
2 2021-01-03      5  4.0
3 2021-01-04      7  6.0
4 2021-01-05      9  8.0