I have two Pandas Data Frames like below:
table1
ID
------
11
22
33
44
table2
ID1 ID2 ID3 VAL
----------------------
11 | 108 | 5 | 5555
45 | 22 | 1146 | 4567
67 | 4987 | 33 | 105
And I need to merge table1 and table2 using column "ID" from table1 and columns: "ID1", "ID2", "ID3" from table2.
I want to do like that:
- merge on ID and ID1
- if ID does not merge with ID1 --> merge ID and ID2
- if ID does not merge with ID1 and ID2 --> merge ID and ID3
- give NULL or NaN if rows does not merge
So as a result I need table like below:
ID ID1 ID2 ID3 VAL
-------------------- -----------
11 | 11 | NULL | NULL | 5555
22 | NULL | 22 | NULL | 4567
33 | NULL | NULL | 33 | 105
44 | NULL | NULL | NULL | NULL
How can i do that in Python Pandas ?
CodePudding user response:
for c in df2.columns[:-1]:
df1[c] = np.where(df1['ID'].eq(df2[c]),df1['ID'],np.nan)
df1 = df1.join(df2.iloc[:,-1])
ID | ID1 | ID2 | ID3 | VAL | |
---|---|---|---|---|---|
0 | 11 | 11 | nan | nan | 5555 |
1 | 22 | nan | 22 | nan | 4567 |
2 | 33 | nan | nan | 33 | 105 |
3 | 44 | nan | nan | nan | nan |