I'm trying to merge 2 dataframes by name where one column is name last name and the other only first name initial last name.
Example:
df1:
name
John Doe
df2:
name
J. Doe
CodePudding user response:
You can use a regex to change the format of df1['name']
and feed this to merge
:
df1.merge(df2,
left_on=df1['name'].str.replace(r'^(.)\w ', r'\1.', regex=True),
right_on='name'
)
NB. check the documentation of merge
for more parameters to format the output, in particular suffixes
to customize the column names.
output:
name name_x name_y
0 J. Doe John Doe J. Doe
example input:
df1 = pd.DataFrame({'name': ['John Doe']})
df2 = pd.DataFrame({'name': ['J. Doe']})