I need help with getting the average of the previous X amount of days into the current position of the new column.
The problem I am having is at the line of code df['avg'] = (df['Close'].shift(0) df['Close'].shift(1)) / 2
.
This is what I want, but of course, I want it to be dynamic. That is where I need help! I can't figure out how to do so because I am having issues with how it already seems to by looping itself when called.
I understand what it is doing and why (...I think) but can't figure out a way around it to get my desired result.
import pandas as pd
import os
import sys
import NasdaqTickerSymbols as nts
class MY_PANDA_INDICATORS():
def __init__(self, days, csvFile):
self.days = days
self.df = None
self.csvFile = csvFile
def GetDataFrame(self):
modpath = os.path.dirname(os.path.abspath(sys.argv[0]))
datapath = os.path.join(modpath, "CSV\\" self.csvFile ".csv")
df = pd.read_csv(datapath)
return(df)
def GetEMA(self):
df['avg'] = df['Close'].shift(0) df['Close'].shift(1)
return(df)
myD = MY_PANDA_INDICATORS(2,nts.matches[0])
print(myD.GetEMA())
Here is what I am getting and also what I want, but I want to be able to change the number of days and get the average of that "x" amount I pass to it. I have tried looping but none work as intended.
Date Open High Low Close Adj Close Volume avg
0 2020-11-16 1.15 1.15 1.11 1.12 1.12 17100 NaN
1 2020-11-17 1.15 1.15 1.11 1.13 1.13 29900 1.125
2 2020-11-18 1.15 1.20 1.12 1.16 1.16 127700 1.145
3 2020-11-19 1.17 1.22 1.16 1.16 1.16 64500 1.160
4 2020-11-20 1.18 1.18 1.14 1.15 1.15 32600 1.155
.. ... ... ... ... ... ... ... ...
246 2021-11-08 2.40 2.40 2.31 2.32 2.32 20000 2.340
247 2021-11-09 2.35 2.35 2.28 2.31 2.31 19700 2.315
248 2021-11-10 2.29 2.31 2.20 2.20 2.20 24200 2.255
249 2021-11-11 2.20 2.22 2.18 2.21 2.21 18700 2.205
250 2021-11-12 2.21 2.22 2.18 2.21 2.21 7800 2.210
CodePudding user response:
You can reindex your DataFrame by the date, and then perform a rolling mean and with the argument x number of days as a string (such as "2D"
):
df['avg'] = df.set_index(["Date"]).rolling(f"{self.days}D").mean().values
On a smaller example:
df = pd.DataFrame({'date': pd.date_range('2021-01-01','2021-01-05'), 'close':[1,3,5,7,9]})
Input:
>>> df
date close
0 2021-01-01 1
1 2021-01-02 3
2 2021-01-03 5
3 2021-01-04 7
4 2021-01-05 9
df['avg'] = df.set_index(["date"]).rolling("2D").mean().values
Output:
>>> df
date close avg
0 2021-01-01 1 1.0
1 2021-01-02 3 2.0
2 2021-01-03 5 4.0
3 2021-01-04 7 6.0
4 2021-01-05 9 8.0