I have 2 dataframes like these:
df1 = pd.DataFrame(data = {'col1' : ['finance', 'accounting'], 'col2' : ['f1', 'a1']})
df2 = pd.DataFrame(data = {'col1' : ['finance', 'finance', 'finance', 'accounting', 'accounting','IT','IT'], 'col2' : ['f1','f2','f3','a1,'a2','I1','I2']})
df1
col1 col2
0 finance f1
1 accounting a1
df2
col1 col2
0 finance f1
1 finance f2
2 finance f3
3 accounting a1
4 accounting a2
5 IT I1
6 IT I2
I would like to do LEFT JOIN on col1 and ANTI-JOIN on col2. The output should look like this:
col1 col2
finance f2
finance f3
accounting a2
Could someone please help me how to do it properly in pandas. I tried both join
and merge
in pandas but it hasn't worked for me. Thanks in advance.
CodePudding user response:
You can merge
and filter:
(df1.merge(df2, on='col1', suffixes=('_', None))
.loc[lambda d: d['col2'] != d.pop('col2_')]
)
Output:
col1 col2
1 finance f2
2 finance f3
4 accounting a2