Home > Back-end >  How to search a dataframe value based on another dataframe value?
How to search a dataframe value based on another dataframe value?

Time:03-16

I have two DFs:

df1:
Date       Fruit  Num  Color 
2013-11-24 Banana 22.1 Yellow
2013-11-24 Orange  8.6 Orange
2013-11-24 Apple   7.6 Green
2013-11-24 Celery 10.2 Green

df2:
Date       Fruit  Num  Color 
2013-11-24 Banana 22.1 Yellow
2013-11-24 Orange  8.6 Orange
2013-11-24 Apple   7.6 Green
2013-11-24 Celery 10.2 Green
2013-11-25 Apple  22.1 Red
2013-11-25 Orange  8.6 Orange

Now I would like to compare the two dfs and put a column 'True' in df2 when the color column of df2 is residing in df1.

desired output:

Date       Fruit  Num  Color  Match
2013-11-24 Banana 22.1 Yellow True
2013-11-24 Orange  8.6 Orange True
2013-11-24 Apple   7.6 Green  True
2013-11-24 Celery 10.2 Green  True
2013-11-25 Apple  22.1 Red    False
2013-11-25 Orange  8.6 Orange True

I came up with the following:

df2['Match'] = np.where(df2['Match'] == df1, True, False)

However got the following error:

ValueError: Can only compare identically-labeled Series objects

And tried the following

flat_user_data['Match'] = np.where(df2['Color'].isin(df1['Color']), True, False)

ValueError: Length of values (5) does not match length of index (10798)

CodePudding user response:

IIUC, Series.isin:

df2['Match'] = df2['Color'].isin(df1['Color'])

Or np.isin:

df2['Match'] = np.isin(df2['Color'], df1['Color'])

CodePudding user response:

In addition to what @ansev suggested, you can achieve this goal by:

checkList = list(df1["Color"])
df2['Match'] = [True if x in checkList else False for x in df2["Color"]]
df2

Output

Date Fruit Num Color Match
0 2013-11-24 Banana 22.1 Yellow True
1 2013-11-24 Orange 8.6 Orange True
2 2013-11-24 Apple 7.6 Green True
3 2013-11-24 Celery 10.2 Green True
4 2013-11-25 Apple 22.1 Red False
5 2013-11-25 Orange 8.6 Orange True

Note that you can use [x in checkList for x in df2["Color"]] instead of [True if x in checkList else False for x in df2["Color"]] (Thanks to @ansev for the comment)

  • Related