I have the following two dataframes.
df1:
name age
0 Alice 12
1 Bob 32
2 Chuck 9
3 Daren 76
4 Eddy 21
and df2:
name age hair_color
0 Alice 12 brown
1 Bob 32 blonde
2 Cory 36 brown
3 David 3 white
4 Eddy 21 orange
I want to extend df1 with a column 'match' that is 1 whenever the ['name', 'age'] pair also appears in df2, and 0 otherwise. My goal is to have the resulting dataframe:
name age match
0 Alice 12 1
1 Bob 32 1
2 Chuck 9 0
3 Daren 76 0
4 Eddy 21 1
Any tips for best approaches? Thanks!
CodePudding user response:
You can do a left merge on the shared columns and flag the rows where hair_color was filled in from df2:
df = df1.merge(df2,how='left').assign(match = lambda x : x['hair_color'].notna().astype(int)).drop(['hair_color'],axis= 1)
Out[17]:
name age match
0 Alice 12 1
1 Bob 32 1
2 Chuck 9 0
3 Daren 76 0
4 Eddy 21 1
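One caveat: this relies on hair_color never being missing in df2, since a matched row whose hair_color is NaN would be flagged 0. Below is a minimal sketch of one variant that sidesteps that by merging in an explicit marker column instead. The helper name `keys` is only for illustration, and the frames are rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "name": ["Alice", "Bob", "Chuck", "Daren", "Eddy"],
    "age": [12, 32, 9, 76, 21],
})
df2 = pd.DataFrame({
    "name": ["Alice", "Bob", "Cory", "David", "Eddy"],
    "age": [12, 32, 36, 3, 21],
    "hair_color": ["brown", "blonde", "brown", "white", "orange"],
})

# Keep only the key columns, drop repeated pairs so the left merge
# cannot duplicate rows of df1, and attach a constant marker column
keys = df2[["name", "age"]].drop_duplicates().assign(match=1)

# Rows of df1 without a partner in df2 get NaN in 'match'
result = df1.merge(keys, how="left", on=["name", "age"])

# Turn NaN into 0 and restore an integer dtype
result["match"] = result["match"].fillna(0).astype(int)
print(result)
```

The drop_duplicates call only matters if df2 can contain the same name/age pair more than once. With the sample data it changes nothing.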
CodePudding user response:
A bit more verbose, but here is another way, using the merge indicator:
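import numpy as np
# Left-merge on the key columns only; indicator=True adds a '_merge' column
df = df1.merge(df2[['name', 'age']], how='left', on=['name', 'age'], indicator=True)
df.rename(columns={'_merge' : 'match'}, inplace=True)
# 'left_only' means the row has no partner in df2 -> 0, otherwise 1
df['match'] = np.select([df['match'].eq('left_only')], [0], 1)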
name age match
0 Alice 12 1
1 Bob 32 1
2 Chuck 9 0
3 Daren 76 0
4 Eddy 21 1
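If you would rather not merge at all, another option is to compare the key pairs directly. This is a sketch of a different technique from the two answers above, built on pd.MultiIndex.from_frame and MultiIndex.isin. The names `keys1` and `keys2` are illustrative, and the frames are rebuilt from the question so the snippet is self-contained.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "name": ["Alice", "Bob", "Chuck", "Daren", "Eddy"],
    "age": [12, 32, 9, 76, 21],
})
df2 = pd.DataFrame({
    "name": ["Alice", "Bob", "Cory", "David", "Eddy"],
    "age": [12, 32, 36, 3, 21],
    "hair_color": ["brown", "blonde", "brown", "white", "orange"],
})

# Represent each row's (name, age) pair as one MultiIndex entry
keys1 = pd.MultiIndex.from_frame(df1[["name", "age"]])
keys2 = pd.MultiIndex.from_frame(df2[["name", "age"]])

# isin returns a boolean array aligned with df1's row order
df1["match"] = keys1.isin(keys2).astype(int)
print(df1)
```

Because nothing is joined, the row count and order of df1 stay the same even if df2 has repeated name/age pairs.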