I have data with 1034 columns. I want to divide each row of it into 3 matrices of 49*7 and delete the 5 columns that are left over. How can I do this in Python?
First, I removed the last 5 columns from the data.
rawData = pd.read_csv('../input/smartgrid/data/data.csv')#import the data
#remove the last 5 columns
rawData.pop('2016/9/9')
rawData.pop('2016/9/8')
rawData.pop('2016/9/7')
rawData.pop('2016/9/6')
rawData.pop('2016/9/5')
Then the data is preprocessed. After that, it is fed to this function, which is supposed to divide each row into three matrices: week1, week2 and week3.
def CNN2D(X_train, X_test, y_train, y_test):
    print('2D - Convolutional Neural Network:')
    # Transform every row of the train set into a 2D array
    n_array_X_train = X_train.to_numpy()
    # Divide n_array_X_train into 3 matrices so they can be fed to a convolution layer like RGB color channels
    week1 = []  # the first matrix
    week2 = []  # the second matrix
    week3 = []  # the third matrix
CodePudding user response:
Here's a way to do what you're asking:
import pandas as pd
import numpy as np
#rawData = pd.read_csv('../input/smartgrid/data/data.csv')#import the data
rawData = pd.DataFrame([[x * 5 + i for x in range(1034)] for i in range(2)], columns=range(1034))
numRowsPerMatrix = len(rawData.columns) // 7 // 3
numColsNeeded = 3 * 7 * numRowsPerMatrix
rawData = rawData.T.iloc[:numColsNeeded].T
for i in range(len(rawData.index)):
    n_array_X_train = rawData.iloc[i].to_numpy()
    week1 = np.reshape(n_array_X_train[:49 * 7], (49, 7))  # the first matrix
    week2 = np.reshape(n_array_X_train[49 * 7: 2 * 49 * 7], (49, 7))  # the second matrix
    week3 = np.reshape(n_array_X_train[2 * 49 * 7:], (49, 7))  # the third matrix
The line rawData = rawData.T.iloc[:numColsNeeded].T transposes the DataFrame, keeps only the first numColsNeeded rows (which were columns in the original df, i.e. all but the last 5), then transposes it back.
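As a side note, the same trimming can probably be done without transposing at all, by slicing columns by position with iloc. The sketch below uses a small stand-in frame (the names df, via_transpose and via_iloc are made up for illustration, not from the question) and compares the two approaches:

```python
import numpy as np
import pandas as pd

# Stand-in data: 2 rows, 1034 integer-labelled columns (hypothetical, not the real CSV).
df = pd.DataFrame(np.arange(2 * 1034).reshape(2, 1034))

# Largest column count that splits evenly into 3 matrices with 7 columns each.
num_cols_needed = 3 * 7 * (df.shape[1] // 7 // 3)

# Approach from the answer: transpose, slice rows, transpose back.
via_transpose = df.T.iloc[:num_cols_needed].T

# Alternative: slice the columns directly by position.
via_iloc = df.iloc[:, :num_cols_needed]

print(via_iloc.shape)
print(np.array_equal(via_transpose.to_numpy(), via_iloc.to_numpy()))
```

Slicing by position also avoids having to name the five date columns one by one, as long as the columns to drop really are the last five.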
The assignments to week1, week2 and week3 slice successive thirds of the 1D numpy array in the current row of rawData and reshape each into a 49 row by 7 column matrix.
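Note that the loop overwrites week1, week2 and week3 on every iteration, so only the last row's matrices survive. If all rows are needed at once, one possible alternative is a single reshape of the whole array. This is a minimal sketch on stand-in data; the names raw, weeks and channels_last are assumptions for illustration, and the channels-last layout is only relevant if the convolution layer expects input shaped (height, width, channels):

```python
import numpy as np
import pandas as pd

# Stand-in data: 2 rows, 1034 columns (hypothetical, not the real CSV).
raw = pd.DataFrame(np.arange(2 * 1034).reshape(2, 1034))

# Keep the first 3 * 49 * 7 = 1029 columns, dropping the 5 leftover ones.
trimmed = raw.iloc[:, :3 * 49 * 7].to_numpy()

# One reshape for all rows: (rows, 3 matrices, 49 rows, 7 columns).
weeks = trimmed.reshape(-1, 3, 49, 7)

# Per-row stacks of each matrix, each of shape (rows, 49, 7).
week1, week2, week3 = weeks[:, 0], weeks[:, 1], weeks[:, 2]

# Optional: move the "matrix" axis last, giving (rows, 49, 7, 3) like an RGB image.
channels_last = np.moveaxis(weeks, 1, -1)

print(weeks.shape, channels_last.shape)
```

Here week1[i] holds the same values as the week1 computed by the loop for row i, because both use NumPy's default row-major ordering.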