Hi, I'm new to machine learning and I'm trying to understand the following code. Can someone explain what it is doing?
training_set = dataset_train.iloc[:,1:2].values
#print(training_set)
#feature scaling
import numpy as np
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler(feature_range=(0,1))
training_set_scaled = sc.fit_transform(training_set)
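# upper index for the training windows (the '+' is assumed; the operator was missing in the post)
Train_cap = int(0.7*len(dataset_train) + 60)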
#creating a data structure with 60 timesteps and 1 output
X_train = []
y_train = []
for i in range(60,Train_cap):
    X_train.append(training_set_scaled[i-60:i,0])
    y_train.append(training_set_scaled[i,0])
X_train,y_train = np.array(X_train),np.array(y_train)
#reshaping
X_train = np.reshape(X_train,(X_train.shape[0],X_train.shape[1],1))
I'm especially confused by this line:
X_train = np.reshape(X_train,(X_train.shape[0],X_train.shape[1],1))
Apologies in advance if this is a silly question or not in the proper form; let me know how I can improve it.
Answer:
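Basically, this code rescales the values to the range 0 to 1 with MinMaxScaler, then splits the data into inputs and targets: each entry of X_train holds 60 consecutive scaled values, and the matching entry of y_train is the value that comes right after them.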
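To make the scaling and windowing concrete, here is a small sketch on a toy series. It assumes a made-up six-value column and a window of 3 instead of 60, and it applies the min-max formula by hand with NumPy rather than calling MinMaxScaler, so treat it as an illustration and not as the original code:

```python
import numpy as np

# Toy column standing in for dataset_train.iloc[:, 1:2].values (hypothetical values)
series = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0]).reshape(-1, 1)

# Min-max scaling to [0, 1], done by hand: (x - min) / (max - min) per column
col_min = series.min(axis=0)
col_max = series.max(axis=0)
scaled = (series - col_min) / (col_max - col_min)

window = 3  # the code in the question uses 60
X, y = [], []
for i in range(window, len(scaled)):
    X.append(scaled[i - window:i, 0])  # the `window` values before position i
    y.append(scaled[i, 0])             # the value at position i is the target
X, y = np.array(X), np.array(y)

print(X.shape, y.shape)  # (3, 3) (3,)
```

Each row of X is one window of past values and the entry of y at the same index is what the model should predict from it.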
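This line X_train = np.reshape(X_train,(X_train.shape[0],X_train.shape[1],1))
adds a third dimension of size 1 to the training inputs, so the shape goes from (samples, 60) to (samples, 60, 1). The values themselves are unchanged. This is the (samples, timesteps, features) layout that recurrent layers such as an LSTM usually expect. Something like this: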
>>> import numpy as np
>>> data = np.zeros((2,3))
>>> data
array([[0., 0., 0.],
       [0., 0., 0.]])
>>> x=np.reshape(data, (data.shape[0], data.shape[1], 1))
>>> x
array([[[0.],
        [0.],
        [0.]],

       [[0.],
        [0.],
        [0.]]])
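If it helps, the same thing can be checked by printing shapes. This is a minimal sketch that assumes a placeholder array of 5 samples with 60 timesteps each (the real X_train will have a different number of rows); it also shows two other NumPy spellings that should give the same result:

```python
import numpy as np

# Placeholder for X_train: 5 hypothetical samples, 60 timesteps each
X_train = np.zeros((5, 60))

# The line from the question: append a trailing axis of size 1
reshaped = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

# Equivalent ways to add the same trailing axis
alt_a = X_train[..., np.newaxis]
alt_b = np.expand_dims(X_train, axis=-1)

print(X_train.shape)   # (5, 60)
print(reshaped.shape)  # (5, 60, 1)
```

The number of elements stays the same (5 * 60 = 5 * 60 * 1); only the way they are indexed changes.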
You should read up on multidimensional arrays; it will help you understand the code. Looking at the documentation of the functions you don't understand can also help.