Build a NumPy array from the rows of two DataFrames

Time:03-12

I have two DataFrames. I want to take the same row from each one and stick the two rows together to build an array.

import pandas as pd
df = pd.DataFrame()
df['a'] = [1, 2, 3]
df['b'] = [4, 7, 1]

df1 = pd.DataFrame()
df1['a'] = [3, 7, 8]
df1['b'] = [9, 2, 1]

For example, I want to take row 1 from both DataFrames and build an array like this:

array([[1, 3],
       [4, 9]])

or, for row 3, the output should be:

array([[3, 8],
       [1, 1]])

This happens inside a loop, and on each iteration I need the array for the current row.

CodePudding user response:

You can use a simple approach as follows. Adjust the (2, 2) shape of test to match your DataFrames.

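import numpy as np
test = np.zeros((2, 2))       # float array: one row per DataFrame
test[0, :] = df.iloc[0, :]    # first row of df
test[1, :] = df1.iloc[0, :]   # first row of df1
test = test.T                 # transpose so each DataFrame becomes a column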

In a for loop, replace the fixed row index 0 with the loop variable, i.e. df.iloc[i, :] and df1.iloc[i, :].
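The loop described above could look like the following sketch. It rebuilds the two DataFrames from the question so that it runs on its own; the names row_arrays and pair are illustrative only, and np.column_stack is swapped in for the zeros-and-transpose step, so the result keeps the integer dtype.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 7, 1]})
df1 = pd.DataFrame({"a": [3, 7, 8], "b": [9, 2, 1]})

row_arrays = []  # hypothetical container, one 2-D array per row
for i in range(len(df)):
    # Row i of df becomes column 0, row i of df1 becomes column 1
    pair = np.column_stack((df.iloc[i].to_numpy(), df1.iloc[i].to_numpy()))
    row_arrays.append(pair)

print(row_arrays[0])
print(row_arrays[2])
```

This assumes both DataFrames have the same number of rows and columns.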

CodePudding user response:

Defining a function would be more helpful in your case:

import numpy as np
def getRowArray(df1=df, df2=df1, row_number=1):
  # row_number is 1-based, so subtract 1 to get the positional index
  array1 = df1.iloc[row_number - 1].to_numpy().reshape(-1, 1)
  array2 = df2.iloc[row_number - 1].to_numpy().reshape(-1, 1)
  return np.concatenate([array1, array2], axis=1)

If you call, for example, getRowArray(row_number=1), it will result in:

array([[1, 3],
       [4, 9]])

Note that positional indexing in a DataFrame starts from zero, but this function follows the example you provided: its first row is number 1, not 0.

Another example:

getRowArray(row_number=3)

Result:

array([[3, 8],
       [1, 1]])

CodePudding user response:

You could concat the two DataFrames and call to_numpy to get a single array of shape len(df) x (df.shape[1] + df1.shape[1]). Then reshape it into a 3D array and transpose to the desired shape:

out = pd.concat((df, df1), axis=1).to_numpy().reshape(*df.shape, 2).transpose(0,2,1)

Output:

array([[[1, 3],
        [4, 9]],

       [[2, 7],
        [7, 2]],

       [[3, 8],
        [1, 1]]], dtype=int64)
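
As an alternative sketch (not part of the answer above), np.stack along a new last axis gives the same values for this example without the reshape and transpose, and it does not depend on the DataFrames having exactly two columns. It assumes df and df1 have the same shape and column order. The integer dtype shown in the output can differ by platform.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 7, 1]})
df1 = pd.DataFrame({"a": [3, 7, 8], "b": [9, 2, 1]})

# out[i, j, 0] comes from df and out[i, j, 1] from df1,
# so out[i] pairs row i of df with row i of df1 column-wise
out = np.stack((df.to_numpy(), df1.to_numpy()), axis=-1)

print(out.shape)
print(out[0])
```

Inside a loop, out[i] is then the array for row i.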