Flag a row if column values present in another dataframe

Time:09-07

I have the following two dataframes.

df1:

     name  age   
0   Alice   12   
1     Bob   32         
2   Chuck    9            
3   Daren   76             
4    Eddy   21  

and df2:

     name  age  hair_color 
0   Alice   12       brown 
1     Bob   32      blonde  
2    Cory   36       brown     
3   David    3       white       
4    Eddy   21      orange

I want to add a column 'match' to df1 that is 1 whenever the row's ['name', 'age'] pair also appears in df2, and 0 otherwise. The resulting dataframe should be:

     name  age  match 
0   Alice   12      1
1     Bob   32      1  
2   Chuck    9      0     
3   Daren   76      0      
4    Eddy   21      1

Any tips for best approaches? Thanks!

CodePudding user response:

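You can do a left merge on the key columns. Rows of df1 with no match in df2 get NaN in 'hair_color', so checking that column gives the flag:

df = df1.merge(df2, how='left', on=['name', 'age']).assign(match=lambda x: x['hair_color'].notna().astype(int)).drop(columns='hair_color')
df
Out[17]: 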
    name  age  match
0  Alice   12      1
1    Bob   32      1
2  Chuck    9      0
3  Daren   76      0
4   Eddy   21      1
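
To try the merge approach above end to end, here is a minimal self-contained sketch that rebuilds the two frames from the question. The variable name `merged` is just for illustration. One assumption to be aware of: the check only works if 'hair_color' is never missing in df2, since a matching row with a NaN hair color would be flagged 0.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "name": ["Alice", "Bob", "Chuck", "Daren", "Eddy"],
    "age": [12, 32, 9, 76, 21],
})
df2 = pd.DataFrame({
    "name": ["Alice", "Bob", "Cory", "David", "Eddy"],
    "age": [12, 32, 36, 3, 21],
    "hair_color": ["brown", "blonde", "brown", "white", "orange"],
})

merged = (
    # Left merge keeps every row of df1; unmatched rows get NaN in hair_color
    df1.merge(df2, how="left", on=["name", "age"])
       # Assumes hair_color is never NaN in df2 itself
       .assign(match=lambda x: x["hair_color"].notna().astype(int))
       .drop(columns="hair_color")
)
print(merged)
```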

CodePudding user response:

A bit verbose, but here is another way, using the merge indicator:

import numpy as np

df = df1.merge(df2[['name', 'age']], how='left', on=['name', 'age'], indicator=True)
df.rename(columns={'_merge': 'match'}, inplace=True)
df['match'] = np.select([df['match'].eq('left_only')], [0], 1)

    name  age  match
0  Alice   12      1
1    Bob   32      1
2  Chuck    9      0
3  Daren   76      0
4   Eddy   21      1
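
Both merge-based answers can add rows to the result if df2 contains the same ['name', 'age'] pair more than once. A possible alternative, not taken from either answer, is to skip the merge and test membership of the key pairs directly with `MultiIndex.isin`. This is only a sketch on the sample data from the question; the names `keys`, `left_pairs`, `right_pairs` and `flagged` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "name": ["Alice", "Bob", "Chuck", "Daren", "Eddy"],
    "age": [12, 32, 9, 76, 21],
})
df2 = pd.DataFrame({
    "name": ["Alice", "Bob", "Cory", "David", "Eddy"],
    "age": [12, 32, 36, 3, 21],
    "hair_color": ["brown", "blonde", "brown", "white", "orange"],
})

keys = ["name", "age"]

# Represent each row's (name, age) pair as one entry of a MultiIndex
left_pairs = pd.MultiIndex.from_frame(df1[keys])
right_pairs = pd.MultiIndex.from_frame(df2[keys])

# isin returns one boolean per row of df1, so the row count never changes
flagged = df1.assign(match=left_pairs.isin(right_pairs).astype(int))
print(flagged)
```

Because no join happens, the result always has exactly as many rows as df1, and it does not depend on any other column of df2.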