I have two dataframes:
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['2013-11-24', '2013-11-24', '2013-11-25', '2013-11-25'],
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery'],
    'Num': [22.1, 8.6, 7.6, 10.2],
    'Color': ['Yellow', 'Orange', 'Green', 'Green'],
})
print(df1)
         Date   Fruit   Num   Color
0  2013-11-24  Banana  22.1  Yellow
1  2013-11-24  Orange   8.6  Orange
2  2013-11-25   Apple   7.6   Green
3  2013-11-25  Celery  10.2   Green
df2 = pd.DataFrame({
    'Date': ['2013-11-25', '2013-11-25', '2013-11-25', '2013-11-25', '2013-11-25', '2013-11-25'],
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery', 'X', 'Y'],
    'Num': [22.1, 8.6, 7.6, 10.2, 22.1, 8.6],
    'Color': ['Yellow', 'Orange', 'Green', 'Green', 'Red', 'Orange'],
})
print(df2)
         Date   Fruit   Num   Color
0  2013-11-25  Banana  22.1  Yellow
1  2013-11-25  Orange   8.6  Orange
2  2013-11-25   Apple   7.6   Green
3  2013-11-25  Celery  10.2   Green
4  2013-11-25       X  22.1     Red
5  2013-11-25       Y   8.6  Orange
I am trying to find the rows that differ between these two dataframes, comparing only on the Fruit column.
This is what I am doing now, but it does not give the expected output:
mapped_df = pd.concat([df1,df2],ignore_index=True).drop_duplicates(keep=False)
print(mapped_df)
Expected output:
         Date Fruit   Num   Color
8  2013-11-25     X  22.1     Red
9  2013-11-25     Y   8.6  Orange
CodePudding user response:
You can use a negated isin:
output = df2.loc[~df2['Fruit'].isin(df1['Fruit'])]
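
For context, here is a self-contained sketch of that approach on the dataframes from the question. The attempt in the question likely fails because drop_duplicates compares whole rows by default, so Banana and Orange survive: their Date differs between df1 and df2. The names only_in_df2 and sym_diff are made up for illustration. The second variant assumes you want to keep the concat-based approach and restricts the comparison to Fruit via subset. Note that it is a symmetric difference, and it would also drop a fruit that repeats within a single dataframe.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Date': ['2013-11-24', '2013-11-24', '2013-11-25', '2013-11-25'],
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery'],
    'Num': [22.1, 8.6, 7.6, 10.2],
    'Color': ['Yellow', 'Orange', 'Green', 'Green'],
})
df2 = pd.DataFrame({
    'Date': ['2013-11-25'] * 6,
    'Fruit': ['Banana', 'Orange', 'Apple', 'Celery', 'X', 'Y'],
    'Num': [22.1, 8.6, 7.6, 10.2, 22.1, 8.6],
    'Color': ['Yellow', 'Orange', 'Green', 'Green', 'Red', 'Orange'],
})

# Negated isin: keep the rows of df2 whose Fruit never appears in df1.
# The result keeps df2's own index labels (4 and 5 here).
only_in_df2 = df2.loc[~df2['Fruit'].isin(df1['Fruit'])]
print(only_in_df2)

# Variant of the original attempt: compare on Fruit only via subset.
# keep=False drops every row whose Fruit occurs more than once, and
# ignore_index=True renumbers the rows, so X and Y get labels 8 and 9.
sym_diff = (
    pd.concat([df1, df2], ignore_index=True)
      .drop_duplicates(subset='Fruit', keep=False)
)
print(sym_diff)
```

Both results contain the X and Y rows. Only the index labels differ: 4 and 5 with isin, 8 and 9 with the concat variant, which matches the expected output in the question.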