Python: reshape array correctly

I have the following function

import datetime as dt
import math
import pandas as pd
import pandas_datareader as web
import numpy as np    
import matplotlib.pyplot as plt
import os.path

from sklearn.preprocessing import MinMaxScaler

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout
from tensorflow.keras.models import load_model

def predict_stock(stock_name, predict_days=30):
    start = dt.datetime(2021, 1, 1)
    end = dt.datetime.now()

    stock = web.DataReader(stock_name, data_source="yahoo", start=start, end=end)
    stock = stock.filter(["Adj Close"])
    stock_data = stock.values

    # splits the stock into training data and test data
    training_len = math.ceil(len(stock) - predict_days)

    scale = MinMaxScaler()
    scaled_data = scale.fit_transform(stock_data)
    train_data = scaled_data[:training_len]

    # sets train values
    x_train = []
    y_train = []

    # each sample uses the previous predict_days values as input
    # and the following predict_days values as target
    for i in range(predict_days, len(train_data)):
        x_train.append(train_data[i - predict_days:i])
        y_train.append(train_data[i:i + predict_days])
    x_train = np.array(x_train)
    y_train = np.array(y_train)
    #y_train.reshape(y_train, x_train.shape)

predict_stock('ALB', 30)

x_train has shape (164, 30, 1), but y_train for some reason has shape (164,), even though both were built the same way.

How can I reshape y_train to (164,30,1)?

I tried the command:

 y_train.reshape(y_train, x_train.shape)

but this gives me the error:

TypeError: only integer scalar arrays can be converted to a scalar index

How can I reshape the array correctly?

CodePudding user response：

The basic flaw in your code is that you passed y_train as the first argument of reshape. Since reshape is a method of the y_train array, it takes only the new shape.
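As a minimal sketch of this point (the small array a is made up for illustration and does not come from the question), passing the array itself as the first argument fails, while passing only the target shape works:

```python
import numpy as np

# Hypothetical stand-in for y_train: six values in a 1D array.
a = np.arange(6)

error_message = None
try:
    # Wrong: the array itself is passed where a dimension is expected.
    a.reshape(a, (2, 3))
except TypeError as exc:
    error_message = str(exc)

# Right: reshape is called on the array, so it only needs the new shape.
reshaped = a.reshape((2, 3))

print(error_message)
print(reshaped.shape)
```

The call only succeeds when the new shape holds exactly as many elements as the array, which is why the next step is needed.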

But spotting this is not enough.
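One possible reason why y_train comes out one-dimensional in the first place, assuming the target slice in the loop is train_data[i:i + predict_days]: near the end of train_data that slice runs past the last row and gets shorter, so the targets do not all have the same length and cannot be stacked into a regular 3D array. The sketch below uses a made-up 10-row array and predict_days = 3 to show the effect, together with one hypothetical way to avoid it by stopping the loop earlier. It is an illustration under these assumptions, not a diagnosis confirmed by the question.

```python
import numpy as np

# Made-up data: 10 rows, one column, standing in for train_data.
predict_days = 3
train_data = np.arange(10, dtype=float).reshape(-1, 1)

# Same loop bounds as in the question: record the length of each target slice.
y_lengths = [len(train_data[i:i + predict_days])
             for i in range(predict_days, len(train_data))]
print(y_lengths)  # the last slices are shorter than predict_days

# Hypothetical fix: stop at the last index that still has a full target window.
last_start = len(train_data) - predict_days
x_windows, y_windows = [], []
for i in range(predict_days, last_start + 1):
    x_windows.append(train_data[i - predict_days:i])
    y_windows.append(train_data[i:i + predict_days])

x_small = np.array(x_windows)
y_small = np.array(y_windows)
print(x_small.shape, y_small.shape)
```

With equally long windows, both arrays get the same 3D shape and no reshape is needed afterwards.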

If you want to reshape y_train to the shape of x_train, then y_train must have the same number of elements as x_train. You can achieve this by calling e.g. np.repeat:

np.repeat(y_train, x_train.shape[1])

i.e. it "multiplies" the occurrences of the source elements, but so far the result is still a 1D array.

The second step is to reshape.

So the whole code can be:

result = np.repeat(y_train, x_train.shape[1]).reshape(x_train.shape)

I intentionally saved the result in another array, in order to keep the source array for any comparison.
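To see what the repeat-then-reshape step does, here is a small sketch with made-up arrays (x_small and y_small are hypothetical stand-ins for x_train and y_train, and y_small is assumed to hold plain numbers):

```python
import numpy as np

# Hypothetical stand-ins: 4 samples, window length 3, 1 feature.
x_small = np.zeros((4, 3, 1))
y_small = np.array([10.0, 20.0, 30.0, 40.0])

# Step 1: repeat every target value once per time step (still 1D).
repeated = np.repeat(y_small, x_small.shape[1])
print(repeated.shape)

# Step 2: reshape the repeated values to the shape of x_small.
result_small = repeated.reshape(x_small.shape)
print(result_small.shape)
print(result_small[1, :, 0])
```

Note that each sample then contains one target value copied along the time axis, not a sequence of successive values.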

But also consider another approach, which probably matches machine learning practice better:

I suppose that it is enough to convert y_train to a single column shape. So try:

result2 = np.expand_dims(y_train, 1)

This time the shape of the result is (164, 1).
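A short sketch of the single-column conversion on a made-up 1D array (y_small is a hypothetical stand-in for y_train). It also shows two other NumPy spellings that should give the same result:

```python
import numpy as np

# Hypothetical stand-in for a 1D y_train with 4 target values.
y_small = np.array([10.0, 20.0, 30.0, 40.0])

# Insert a new axis at position 1: shape (4,) becomes (4, 1).
column = np.expand_dims(y_small, 1)

# Equivalent alternatives: reshape with an inferred row count, or np.newaxis.
same_via_reshape = y_small.reshape(-1, 1)
same_via_newaxis = y_small[:, np.newaxis]

print(column.shape)
```

Which spelling to use is mostly a matter of taste; none of them copies or repeats the target values.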