Collapse rows of dataframe into matrices

Time:01-18

I have a CSV that produces a 60000 x 785 dataframe. In each row the first column is a digit and the remaining 784 columns are the pixel values for that digit. I need to collapse the 784 pixels into a 28 x 28 matrix per row. The resulting dataframe should have the digit in the first column and the 28 x 28 matrix of pixel values in the second column.

digit p1 p2 p3 p4 ... p784

I've tried reshaping the data, but that failed. What should I do to reshape it?

CodePudding user response:

First set 'digit' as the index, then try this:

df.apply(lambda x: x.values.reshape(28, -1), axis=1)
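
If you would rather avoid a per-row apply on 60000 rows, here is a rough vectorized sketch of the same idea. It assumes the first column holds the digit and the other 784 columns hold the pixels. The helper name `collapse_pixels` and the column names `digit` and `image` are made up for illustration, and the demo frame contains random values, not real data.

```python
import numpy as np
import pandas as pd


def collapse_pixels(df, side=28):
    """Return a two-column frame: the label and one side x side matrix per row.

    Assumes column 0 of df is the label and the rest are side*side pixels.
    """
    labels = df.iloc[:, 0].to_numpy()
    # One reshape for the whole table: (n_rows, 784) -> (n_rows, 28, 28).
    pixels = df.iloc[:, 1:].to_numpy().reshape(-1, side, side)
    # Pre-allocate an object array so each cell holds a whole 2-D matrix.
    images = np.empty(len(pixels), dtype=object)
    for i, matrix in enumerate(pixels):
        images[i] = matrix
    return pd.DataFrame({"digit": labels, "image": images}, index=df.index)


# Tiny stand-in for the real 60000 x 785 frame: 3 rows of made-up values.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.integers(0, 256, size=(3, 785)))
out = collapse_pixels(demo)
print(out.shape)                   # (3, 2)
print(out.loc[0, "image"].shape)   # (28, 28)
```

Each cell of the `image` column is then a plain NumPy array, so `out.loc[i, "image"]` gives the 28 x 28 matrix for row `i`.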

CodePudding user response:

You can try storing the data as a dictionary whose keys are the digits in the first column and whose values are 28x28 arrays containing the pixel values of each row.

  • Import packages:
import csv
import numpy as np
import pandas as pd
  • Create a test.csv file:
with open('./test.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    for i in range(60000):
        row_i = (i + 1) * np.ones(785, dtype=int)  # row i: every value is i + 1
        writer.writerow(row_i)
  • Load the csv as a dataframe and convert it to a dictionary of 28x28 arrays:
data = pd.read_csv('./test.csv', header=None, index_col=0)
data = data.T.to_dict('list')
for i, value in data.items():
    data[i] = np.asarray(value).reshape(28,28)
  • Check the output:
print(len(data))
print(data)
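
One caveat with the dictionary: its keys are the first-column values. Those are unique in test.csv, but in a real digit dataset the same digit appears on many rows, and a dictionary can hold each key only once, so rows sharing a digit would not all survive. A minimal sketch of one way around that, assuming it is fine to keep the labels and the matrices in two parallel arrays (the names `labels` and `images` and the tiny in-memory CSV are mine, for illustration only):

```python
import io

import numpy as np
import pandas as pd

# Small in-memory stand-in for the CSV: 4 rows, labels 5, 0, 5, 1 (label 5 repeats).
rows = [[label] + [label] * 784 for label in (5, 0, 5, 1)]
csv_text = "\n".join(",".join(map(str, r)) for r in rows)

frame = pd.read_csv(io.StringIO(csv_text), header=None)

labels = frame.iloc[:, 0].to_numpy()                        # shape (4,)
images = frame.iloc[:, 1:].to_numpy().reshape(-1, 28, 28)   # shape (4, 28, 28)

print(labels)        # [5 0 5 1]
print(images.shape)  # (4, 28, 28)

# All matrices for one digit can still be looked up with a boolean mask.
fives = images[labels == 5]
print(fives.shape)   # (2, 28, 28)
```

Row `i` of `images` always belongs to `labels[i]`, so nothing is lost when a digit repeats.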