Is there a simple function (in either pandas or NumPy) to create a new column of true or false values, based on whether a value matches between two different dataframes?
I'm trying to compare two dataframes that both have an email column and see which emails in the first dataframe also appear in the second. The goal is to print a table that looks like this (where [email protected] is in both the first and second dataframe):
| id | email | match |
|:------|:------ |:-------|
| 1 | [email protected] | true|
| 2 | [email protected] | false|
| 3 | [email protected] | false|
Thanks in advance for your help
CodePudding user response:
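# Flag each row of df1 whose email also appears in df2.
# isin already returns booleans, so np.where(..., True, False) is not needed.
df1 = df1.assign(
    match=df1["email"].isin(df2["email"])
)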
CodePudding user response:
You can, for example, use the Series method isin:
df1['match'] = df1['email'].isin(df2['email'])
df2['match'] = df2['email'].isin(df1['email'])
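To see this on the table from the question, here is a minimal, self-contained sketch. The dataframes and addresses are made-up placeholders (the real emails are not shown in the question), so treat the values as assumptions:

```python
import pandas as pd

# Hypothetical sample data; these addresses are placeholders, not from the question.
df1 = pd.DataFrame({
    "id": [1, 2, 3],
    "email": ["a@example.com", "b@example.com", "c@example.com"],
})
df2 = pd.DataFrame({
    "id": [10, 11],
    "email": ["a@example.com", "z@example.com"],
})

# True where the df1 email also appears anywhere in df2, False otherwise.
df1["match"] = df1["email"].isin(df2["email"])

print(df1)  # only the first row (a@example.com) gets match == True
```

Note that isin compares exact values, so addresses that differ only in letter case or surrounding whitespace will not be reported as matches unless you normalize both columns first.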