How to combine two Pandas dataframes into a single one across the axis=2 (ie. so that the cell value

Time:10-01

I have two (large) dataframes. They have the same index & columns, and I want to combine them so that they have tuple values in each cell.

The example explains it best:

df1 = pd.DataFrame({
   'A':[True, True, False],
   'B':[False, True, False], 
})

df2 = pd.DataFrame({
   'A':[1, 2, 3],
   'B':[5, 6, 7], 
})

# Desired output:

pd.DataFrame({
   'A':[(True, 1), (True, 2), (False, 3)],
   'B':[(False, 5), (True, 6), (False, 7)], 
})

The DataFrames are large (1M rows), so I'm looking to do this somewhat efficiently.

I tried np.stack([df1.values, df2.values], axis=2) and that got me the right value array, but I could not convert it into a dataframe.

Any ideas?

CodePudding user response:

I got your desired output with this solution:

import pandas as pd

df1 = pd.DataFrame({
    'A':[True, True, False],
    'B':[False, True, False], 
})

df2 = pd.DataFrame({
    'A':[1, 2, 3],
    'B':[5, 6, 7], 
})

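# Walk the two frames' columns in parallel and overwrite each df1 column
# (in place) with a list of (df1 value, df2 value) tuples.
for df_1k, df_2k in zip(df1.columns, df2.columns):
    df1[df_1k] = list(zip(df1[df_1k], df2[df_2k]))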

print(df1)
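
If you would rather keep df1 untouched, a variation on the same idea is to build a new DataFrame instead of assigning back into df1. This is only a minimal sketch, assuming both frames share the same index and column labels as described in the question; the name `combined` is made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'A': [True, True, False],
    'B': [False, True, False],
})

df2 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [5, 6, 7],
})

# Assumption: df1 and df2 have identical index and column labels.
# zip() already yields tuples, so each column becomes a list of
# (df1 value, df2 value) pairs stored in an object-dtype column.
combined = pd.DataFrame(
    {col: list(zip(df1[col], df2[col])) for col in df1.columns},
    index=df1.index,
)

print(combined)
```

Zipping the columns directly also keeps the booleans as booleans. np.stack on a bool array and an int array promotes both to a common integer dtype, so True would come back as 1 there. The result is still an object column of Python tuples, so on 1M rows it will use noticeably more memory than the two original frames.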