How to divide each row of data into 3 matrices in Python?

Time:05-13

I have data with 1034 columns, and I want to divide each row into 3 matrices of 49*7. That uses 3 * 49 * 7 = 1029 columns, so the 5 remaining columns should be deleted. How can I do this in Python?

First, I removed the last 5 columns from the data.

import pandas as pd

rawData = pd.read_csv('../input/smartgrid/data/data.csv')  # import the data

# remove the last 5 columns
rawData.pop('2016/9/9')
rawData.pop('2016/9/8')
rawData.pop('2016/9/7')
rawData.pop('2016/9/6')
rawData.pop('2016/9/5')

Then the data is preprocessed. After that, it is fed to this function, which is supposed to divide each row into three matrices: week1, week2 and week3.

def CNN2D(X_train, X_test, y_train, y_test):
    print('2D - Convolutional Neural Network:')
    # Transforming every row of the train set into a 2D array
    n_array_X_train = X_train.to_numpy()
    # divide n_array_X_train into 3 matrices so it can be fed to a convolution layer like RGB color channels
    week1 = []  # the first matrix
    week2 = []  # the second matrix
    week3 = []  # the third matrix

CodePudding user response:

Here's a way to do what you're asking:

import pandas as pd
import numpy as np
#rawData = pd.read_csv('../input/smartgrid/data/data.csv')#import the data
rawData = pd.DataFrame([[x * 5 + i for x in range(1034)] for i in range(2)], columns=range(1034))  # stand-in test data: 2 rows, 1034 columns

numRowsPerMatrix = len(rawData.columns) // 7 // 3
numColsNeeded = 3 * 7 * numRowsPerMatrix
rawData = rawData.T.iloc[:numColsNeeded].T

for i in range(len(rawData.index)):
    n_array_X_train = rawData.iloc[i].to_numpy()
    week1 = np.reshape(n_array_X_train[:49 * 7], (49, 7))  # the first matrix
    week2 = np.reshape(n_array_X_train[49 * 7:2 * 49 * 7], (49, 7))  # the second matrix
    week3 = np.reshape(n_array_X_train[2 * 49 * 7:], (49, 7))  # the third matrix

The line rawData = rawData.T.iloc[:numColsNeeded].T transposes the DataFrame, keeps only the first numColsNeeded rows (which were the columns of the original df, i.e. all but the last 5), then transposes it back.
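As a side note, the same trimming can probably be written without the double transpose by slicing columns positionally. The sketch below is only an illustration on synthetic data (the frame `raw` and the snake_case names are made up here, not taken from the question's CSV), and it checks that both forms give the same values:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real CSV (assumption): 2 rows, 1034 columns.
raw = pd.DataFrame(np.arange(2 * 1034).reshape(2, 1034))

num_rows_per_matrix = raw.shape[1] // 7 // 3   # 1034 // 7 = 147, 147 // 3 = 49
num_cols_needed = 3 * 7 * num_rows_per_matrix  # 3 * 7 * 49 = 1029

# Form used in the answer: transpose, slice rows, transpose back.
via_transpose = raw.T.iloc[:num_cols_needed].T

# Alternative: slice the columns directly by position.
via_iloc = raw.iloc[:, :num_cols_needed]

print(via_iloc.shape)  # (2, 1029)
same_values = np.array_equal(via_transpose.to_numpy(), via_iloc.to_numpy())
print(same_values)     # True
```

Slicing by position also avoids hard-coding the five date column names, which may be convenient if the column labels change.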

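The assignments to week1, week2 and week3 slice successive thirds of the 1D numpy array in the current row of rawData and reshape each into a 49-row by 7-column matrix. Note that the three variables are overwritten on every iteration of the loop, so append them to a list (or process them inside the loop) if the matrices for every row are needed.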
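If all rows are needed at once, the per-row loop can likely be replaced by a single reshape over the whole array. This is a minimal sketch on made-up data (the array `data` and the sample count of 4 are assumptions for illustration), and the channels-last layout at the end is only one possible way to feed three matrices to a convolution layer like RGB channels:

```python
import numpy as np

# Made-up input (assumption): 4 samples, already trimmed to 1029 columns.
n_samples = 4
data = np.arange(n_samples * 1029, dtype=float).reshape(n_samples, 1029)

# One reshape for every row: (samples, matrix index, 49 rows, 7 columns).
weeks = data.reshape(n_samples, 3, 49, 7)
week1, week2, week3 = weeks[:, 0], weeks[:, 1], weeks[:, 2]

# Stack the three matrices as the last axis, like color channels:
# (samples, 49, 7, 3).
channels_last = np.stack([week1, week2, week3], axis=-1)

print(weeks.shape)          # (4, 3, 49, 7)
print(channels_last.shape)  # (4, 49, 7, 3)
```

Because NumPy reshapes in row-major order by default, weeks[i, 0] holds the same values as np.reshape(data[i, :49 * 7], (49, 7)) from the loop version, and likewise for the second and third thirds.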
