"Merging" Same sized dataframes into one dataframe-CodePudding

I have a bunch of Dataframes following this kind of pattern:

col1  col2  col3   
1     2     3  
1     2     3  
1     2     3 

col1  col2  col3   
1     2     3  
1     2     3  
1     2     3

How do I merge them into:

col1  col2  col3   
[1,1] [2,2] [3,3] 
[1,1] [2,2] [3,3]
[1,1] [2,2] [3,3]

I have no idea how to do this; it just feels like there should be an easy way.

CodePudding user response：

If your dataframes are well aligned (same shape, same row and column order), you can use numpy.dstack:

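import numpy as np
import pandas as pd

# Stack the frames along a third axis, then turn that axis into one list per cell
out = pd.DataFrame(np.dstack([df1, df2]).tolist(),
                   index=df1.index, columns=df1.columns)
print(out)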

# Output
     col1    col2    col3
0  [1, 1]  [2, 2]  [3, 3]
1  [1, 1]  [2, 2]  [3, 3]
2  [1, 1]  [2, 2]  [3, 3]
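
Since the question mentions "a bunch of" dataframes, the same call should extend to more than two, because np.dstack accepts any sequence of equally shaped arrays. A minimal sketch, assuming the frames are collected in a list (the name frames and the sample data are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three frames with identical shape and labels
frames = [
    pd.DataFrame({"col1": [1, 1, 1], "col2": [2, 2, 2], "col3": [3, 3, 3]})
    for _ in range(3)
]

# dstack yields shape (rows, cols, n_frames); tolist() turns the last axis
# into one Python list per cell
stacked = np.dstack(frames)
out = pd.DataFrame(stacked.tolist(),
                   index=frames[0].index, columns=frames[0].columns)
print(out)
```

Each cell then holds one value per input frame, in the order the frames appear in the list.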

Update

Using only pandas:

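# Concatenate, move the columns into the index, then collect the values
# that share the same (row label, column label) pair into a list
out = pd.concat([df1, df2]).stack().groupby(level=[0, 1]) \
        .apply(list).unstack(level=1)
print(out)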

# Output
     col1    col2    col3
0  [1, 1]  [2, 2]  [3, 3]
1  [1, 1]  [2, 2]  [3, 3]
2  [1, 1]  [2, 2]  [3, 3]
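
One practical difference between the two variants is worth illustrating: the pandas-only version groups by label, whereas np.dstack pairs rows purely by position. The sketch below uses made-up frames (df1 and df2 here are hypothetical, with the same row labels in a different order) to show how the results can diverge when the frames are not aligned:

```python
import numpy as np
import pandas as pd

# Hypothetical frames holding the same row labels in a different order
df1 = pd.DataFrame({"col1": [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({"col1": [30, 10, 20]}, index=[2, 0, 1])

# Label-based: values are grouped by (row label, column label)
by_label = (
    pd.concat([df1, df2])
    .stack()
    .groupby(level=[0, 1])
    .apply(list)
    .unstack(level=1)
)

# Position-based: np.dstack ignores the index and pairs rows by position
by_position = np.dstack([df1, df2]).tolist()

print(by_label)
print(by_position)
```

Here row label 0 ends up as [1, 10] in the label-based result, while the position-based result pairs the first rows of each frame and gives [1, 30].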

CodePudding user response：

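Try this:

import pandas as pd
df1 = pd.DataFrame([[10, 20, 30], [10, 20, 30], [10, 20, 30]])
df2 = pd.DataFrame([[11, 12, 13], [11, 12, 13], [11, 12, 13]])
# Wrap every cell in a one-element list, then add the frames cell by cell
# (applymap is deprecated since pandas 2.1 in favor of DataFrame.map)
df1.applymap(lambda x: [x]) + df2.applymap(lambda x: [x])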

→

          0         1         2
0  [10, 11]  [20, 12]  [30, 13]
1  [10, 11]  [20, 12]  [30, 13]
2  [10, 11]  [20, 12]  [30, 13]

Explanation: lambda x: [x] is a function which converts its argument x into a list of length 1 containing exactly that argument.

.applymap applies this function to every cell in the data frame.

+ (the sum operator) is "overloaded" for pandas data frames. In particular, the sum f1 + f2 of two frames (of equal shape) is defined as a new frame containing in each cell the sum of the corresponding cells of the operands (f1 and f2).

This is trivial if the cells contain numbers. But this also works for other data types: in Python, lists can be concatenated via the sum operator: [1, 2] + [50, 60] → [1, 2, 50, 60].
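
The cell-wise list concatenation described above can presumably be folded over any number of frames. A minimal sketch, assuming all frames share the same shape and labels; the helper wrap_cells, the name frames, and the sample data are hypothetical and not part of the original answer:

```python
from functools import reduce
import operator

import pandas as pd

# Hypothetical sample frames of equal shape
frames = [
    pd.DataFrame([[10, 20], [10, 20]]),
    pd.DataFrame([[11, 12], [11, 12]]),
    pd.DataFrame([[5, 6], [5, 6]]),
]

def wrap_cells(df):
    """Return a copy of df with every cell wrapped in a one-element list."""
    # DataFrame.map replaces applymap from pandas 2.1 on; fall back otherwise
    if hasattr(pd.DataFrame, "map"):
        return df.map(lambda x: [x])
    return df.applymap(lambda x: [x])

# Plain Python: + concatenates lists
concatenated = [1, 2] + [50, 60]

# The same + applied cell by cell, folded across all frames
merged = reduce(operator.add, (wrap_cells(df) for df in frames))
print(merged)
```

Folding with reduce keeps the frames in list order inside each cell, so the first cell here holds the values 10, 11 and 5.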