I want to create a new dataframe out of 3 original dataframes I have.
3 dataframes have format: Name|col1|col2
. Names are identical from all dataframes, the differences are values from col1
and col2
.
df1:
abc 1 2
xyz 3 4
df2:
abc 5 6
xyz 7 8
df3:
abc 9 10
xyz 11 12
new dataframe should have format:
abc(df1,df2) 1 6
abc(df1,df3) 1 10
abc(df2,df3) 5 10
abc(df3,df1) 5 2
abc(df3,df1) 9 2
abc(df3,df2) 9 6
xyz(df1,df2) 3 8
xyz(df1,df3) 3 12
xyz(df2,df1) 7 4
xyz(df2,df3) 7 12
xyz(df3,df1) 11 4
xyz(df3,df2) 11 8
CodePudding user response:
This should do the trick, you may need to add another few steps to get in in the format you need but the logic to make it is all there.
df1['key'] = 'df1'
df2['key'] = 'df2'
df3['key'] = 'df3'
(pd.merge(
pd.concat([df1, df2, df3]),
pd.concat([df1, df2, df3]),
on = 0).
query('key_x != key_y').
drop(columns = ['2_x', '1_y']))
Assuming your current df columns are called 0 to 2
This will get you:
0 1_x key_x 2_y key_y
1 abc 1 df1 6 df2
2 abc 1 df1 10 df3
3 abc 5 df2 2 df1
5 abc 5 df2 10 df3
6 abc 9 df3 2 df1
7 abc 9 df3 6 df2
10 xyz 3 df1 8 df2
11 xyz 3 df1 12 df3
12 xyz 7 df2 4 df1
14 xyz 7 df2 12 df3
15 xyz 11 df3 4 df1
16 xyz 11 df3 8 df2