Home > Mobile >  Left Join and Anti-Join on same data frames Pandas
Left Join and Anti-Join on same data frames Pandas

Time:08-30

I have 2 dataframes like these:

df1 = pd.DataFrame(data = {'col1' : ['finance',  'accounting'], 'col2' : ['f1', 'a1']}) 
df2 = pd.DataFrame(data = {'col1' : ['finance', 'finance', 'finance', 'accounting', 'accounting','IT','IT'], 'col2' : ['f1','f2','f3','a1,'a2','I1','I2']})

df1

    col1        col2
0   finance      f1
1   accounting   a1

df2

    col1        col2
0   finance      f1
1   finance      f2
2   finance      f3
3   accounting   a1 
4   accounting   a2
5   IT           I1
6   IT           I2

I would like to do LEFT JOIN on col1 and ANTI-JOIN on col2. The output should look like this:

    col1      col2
   finance     f2
   finance     f3
   accounting  a2 

Could someone please help me how to do it properly in pandas. I tried both join and merge in pandas but it hasn't worked for me. Thanks in advance.

CodePudding user response:

You can merge and filter:

(df1.merge(df2, on='col1', suffixes=('_', None))
   .loc[lambda d: d['col2'] != d.pop('col2_')]
 )

Output:

         col1 col2
1     finance   f2
2     finance   f3
4  accounting   a2
  • Related