I have two dataframes defined as:
dataframe1 = pd.DataFrame( [[123, "ABC", "DEF", "GHI"],
[123, "ABC", "DEF", "GHI"],
[123, "ABC", "DEF", "GHI"],
[456, "JKL", "MNO", "PQR"],
[456, "JKL", "MNO", "PQR"],
[456, "JKL", "MNO", "PQR"]],
columns=["ID",
"Attr1", "Attr2", "Attr3"],
)
dataframe2 = pd.DataFrame( [["1", "A", 123],
["2", "A", 123],
["1", "C", 456],
["1", "B", 456],
["2", "B", 123],
["2", "C", 456]],
columns=["ROW","LOCATION","ID"],
)
I want to merge these two dataframe based on ID number. I have tried merge function as
pd.merge(dataframe1,dataframe2, on = 'ID')
However, that does not generate my desired result. I want an output like
ID ROW LOCATION Attr1 Attr2 Attr3
0 123 1 A ABC DEF GHI
1 123 2 A ABC DEF GHI
2 123 2 B ABC DEF GHI
3 456 1 B JKL MNO PQR
4 456 1 C JKL MNO PQR
5 456 2 C JKL MNO PQR
CodePudding user response:
Is this what you are trying to get?
df_merge = pd.merge(dataframe1, dataframe2, on = ['ID']).drop_duplicates(ignore_index = True)
df_merge[['ID', 'ROW', 'LOCATION', 'Attr1', 'Attr2', 'Attr3']]
CodePudding user response:
To get your desired result, you can sort each dataframe (both on ID
and dataframe2
on LOCATION
and ROW
as well) and then use concat
to join them horizontally on their indexes, dropping the ID
column from dataframe1
:
df1 = dataframe1.sort_values(['ID']).reset_index(drop=True)
df2 = dataframe2.sort_values(['ID', 'LOCATION', 'ROW']).reset_index(drop=True)
pd.concat([df2,df1.drop('ID', axis=1)], axis=1)
Output:
ROW LOCATION ID Attr1 Attr2 Attr3
0 1 A 123 ABC DEF GHI
1 2 A 123 ABC DEF GHI
2 2 B 123 ABC DEF GHI
3 1 B 456 JKL MNO PQR
4 1 C 456 JKL MNO PQR
5 2 C 456 JKL MNO PQR
CodePudding user response:
Although the exact logic is unclear, it looks like you want a merge in order of the duplicates.
For this you can add a secondary key:
(dataframe1
.assign(ID2=lambda d: d.groupby('ID').cumcount())
.merge(dataframe2.assign(ID2=lambda d: d.groupby('ID').cumcount()),
on=['ID', 'ID2'])
)
Output:
ID Attr1 Attr2 Attr3 ID2 ROW LOCATION
0 123 ABC DEF GHI 0 1 A
1 123 ABC DEF GHI 1 2 A
2 123 ABC DEF GHI 2 2 B
3 456 JKL MNO PQR 0 1 C
4 456 JKL MNO PQR 1 1 B
5 456 JKL MNO PQR 2 2 C