I am self-learning ML and DS, and I am stuck trying to split my DataFrame (dfc) into train and test sets. The error below, and various posts on this site, suggest that it is caused by indexing with something that is not an integer. As far as I understand, though, I have already handled that by converting the split point to an integer ("split = int(0.80*len(dfc))").
I would appreciate it if someone could point me in the right direction.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use("seaborn-v0_8")
import warnings
warnings.filterwarnings("ignore")
import yfinance as yf
import ta
df = yf.download("GOOG")
df = df[["Adj Close"]]
df.columns = ["close"]
df = df.sort_index(ascending=False)
df["returns"] = df['close'].pct_change(1)
df["SMA 15"] = df[["close"]].rolling(15).mean().shift(1)
df["SMA 60"] = df[["close"]].rolling(60).mean().shift(1)
df["MSD 15"] = df[["returns"]].rolling(15).std().shift(1)
df["MSD 60"] = df[["returns"]].rolling(60).std().shift(1)
RSI = ta.momentum.RSIIndicator(df["close"], window=14, fillna=False)
df["rsi"] = RSI.rsi()
df["rsi"].loc["2010"].plot(figsize=(15,8))
dfc =df.columns
# Percentage of train set
split = int(0.80*len(dfc))
# Train set creation
X_train = dfc[['SMA 15', 'SMA 60', 'MSD 15', 'MSD 30', 'rsi']].iloc[:split] # From beginning to split
Y_train = dfc[['returns']].iloc[:split]
# Test set creation
X_test = dfc[['SMA 15', 'SMA 60', 'MSD 15', 'MSD 30', 'rsi']].iloc[split:] # From split to end
Y_test = dfc[['returns']].iloc[split:]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
CodePudding user response:
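A possible issue is that you are not indexing the DataFrame df, but its column labels dfc. Since dfc = df.columns, dfc is a pandas Index of column names, not a DataFrame, so selecting a list of names from it raises the IndexError, and len(dfc) is the number of columns rather than the number of rows.
So you could try using df, both for the split point and for the train and test selections, like so:
split = int(0.80*len(df))
X_train = df[['SMA 15', 'SMA 60', 'MSD 15', 'MSD 60', 'rsi']].iloc[:split]
Note that your code creates a column named 'MSD 60', not 'MSD 30', so selecting 'MSD 30' from df would raise a KeyError.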
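To see both the cause and the fix without downloading anything, here is a minimal sketch on synthetic data. The random-walk prices, the reduced feature list, and the dropna() step are my own assumptions for illustration, not part of the original code; only the pattern of splitting df rather than df.columns comes from the answer above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded prices (hypothetical data, not GOOG)
rng = np.random.default_rng(0)
idx = pd.date_range("2020-01-01", periods=200, freq="D")
df = pd.DataFrame({"close": 100 + rng.normal(0, 1, 200).cumsum()}, index=idx)
df["returns"] = df["close"].pct_change(1)
df["SMA 15"] = df["close"].rolling(15).mean().shift(1)
df["MSD 15"] = df["returns"].rolling(15).std().shift(1)

# df.columns is an Index holding the column labels, not the data itself,
# so selecting a list of names from it fails with an IndexError
dfc = df.columns
try:
    dfc[["SMA 15", "MSD 15"]]
    raised = False
except IndexError:
    raised = True

# Assumption: drop the NaN rows produced by rolling/shift before splitting
data = df.dropna()
features = ["SMA 15", "MSD 15"]

# Split point is based on the number of rows of the DataFrame
split = int(0.80 * len(data))
X_train = data[features].iloc[:split]   # from beginning to split
y_train = data[["returns"]].iloc[:split]
X_test = data[features].iloc[split:]    # from split to end
y_test = data[["returns"]].iloc[split:]
```

Because the rows here are in ascending date order, every training row precedes every test row. If the DataFrame is sorted in descending order, as in the question, the first 80% of rows would be the most recent ones, which may not be what is intended.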