Home > Net >  How to join two tables by one or another ID column (python pandas)?
How to join two tables by one or another ID column (python pandas)?

Time:10-22

I am using Python pandas, I would appreciate the code in Python. But you can suggest a solution in Excel or SQL. I have two tables A and B. Table A has two ID columns by which I want to join table B. Table B has one ID column that either corresponds with one or the other ID column of Table A. Here is an example:

enter image description here

enter image description here

And this is the resulting table AB that I want: enter image description here

CodePudding user response:

in sql:

select * from tableA a
left join tableB b
on b.ID in (a.ID1, a.ID2)

CodePudding user response:

You can use pd.merge() dropna() isna() pd.concat()

import pandas as pd

m1 = pd.merge(df_A, df_B, left_on='ID Column 1', right_on='ID Column', how='left')
m1_mismatches = m1[m1['ID Column'].isna()].dropna(axis=1)
m1_matches = m1[~m1['ID Column'].isna()]

m2 = pd.merge(m1_mismatches, df_B, left_on='ID Column 2', right_on='ID Column', how='left')
df_AB = pd.concat([m1_matches, m2]).drop(columns='ID Column').reset_index(drop=True)
print(df_AB)

The idea here is to do an initial "left join" between df_A and df_B, and then a second "left join" with the mismatches found in the first join. Finally we use pd.concat to concat the results.

Output:

ID Column 1 ID Column 2 col 1 col 2 col 3 col 4
0 [email protected] [email protected] lol 9479 pee pee poo poo 985023.0
1 [email protected] [email protected] bla bla bla 123 bla bla bla hello 124323.0
2 [email protected] [email protected] harry potter 2077 NaN NaN
  • Related