Get values from another dataframe if NaN after merge by incomplete name

I have two DataFrames. After merging them on "Name", some rows return NaN because some of the names are incomplete.

df1

Name	Info 1
Walter	Adress 1
john wick	Adress 1

df2

Name	Info 2
Walter White	Male
john wick	Male

df2 = pd.merge(df1,df2,on='Name', how='left')

I'm getting

Name	Info 1	Info 2
Walter	NaN	NaN
john wick	Adress 1	Male

I want

Name	Info 1	Info 2
Walter White	Adress 1	Male
john wick	Adress 1	Male

How can I handle those rows, so that when the merge returns NaN it tries to get the values by substring match instead? I don't know whether using merge in the first place was the best approach.

CodePudding user response：

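Try this:

df2 = pd.merge_asof(df1, df2, on='Name')

Note that merge_asof has no how parameter (it always behaves like a left join) and matches each key to the nearest one in sorted order. It is designed for ordered keys such as numbers or timestamps, so whether it helps depends on how closely the values resemble each other; for text names like these it may not work at all.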
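
If the goal is to match on how closely the names resemble each other, one alternative is to map each name in df1 to its most similar name in df2 before doing an ordinary merge. The sketch below swaps in difflib.get_close_matches from the standard library for this; it is not what merge_asof does. The helper closest_name and the 0.6 cutoff are assumptions chosen for illustration, and the sample frames are rebuilt from the question.

```python
import difflib

import pandas as pd

# Sample data rebuilt from the question.
df1 = pd.DataFrame({"Name": ["Walter", "john wick"],
                    "Info 1": ["Adress 1", "Adress 1"]})
df2 = pd.DataFrame({"Name": ["Walter White", "john wick"],
                    "Info 2": ["Male", "Male"]})


def closest_name(name, candidates, cutoff=0.6):
    """Return the candidate most similar to `name`, or None if none passes the cutoff.

    Hypothetical helper; the cutoff of 0.6 is an assumption and may need tuning.
    """
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Replace each (possibly incomplete) name in df1 with its closest full name in df2.
candidates = df2["Name"].tolist()
df1_matched = df1.assign(Name=df1["Name"].map(lambda n: closest_name(n, candidates)))

# Names with no sufficiently similar candidate become None and get NaN in "Info 2".
result = pd.merge(df1_matched, df2, on="Name", how="left")
print(result)
```

Similarity-based matching can pick the wrong row when several names look alike, so it is worth checking the mapped names before trusting the merged result.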

CodePudding user response：

The reason it's not working is that pandas doesn't treat "Walter" and "Walter White" as the same value. A left join on df1 keeps every row of df1 and adds the values from df2 only where the "Name" values are identical. Since "Walter" is not present in df2, the "Info 2" column gets NaN for that row.

One way to solve this is to create a separate "First_Name" column in both dataframes and merge on that instead, something like:

df1["First_Name"] = df1["Name"].str.split().str[0]
df2["First_Name"] = df2["Name"].str.split().str[0]

Then use the same merge as before, but on the new column:

df2 = pd.merge(df1, df2, on='First_Name', how='left')

Both dataframes still have a "Name" column, so the result will contain "Name_x" (from df1) and "Name_y" (from df2).
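
Putting the first-name approach together, a minimal end-to-end sketch might look like the following. The sample frames are rebuilt from the question. Lower-casing the key, the suffixes "_short" and "_full", and keeping the full name from df2 are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data rebuilt from the question.
df1 = pd.DataFrame({"Name": ["Walter", "john wick"],
                    "Info 1": ["Adress 1", "Adress 1"]})
df2 = pd.DataFrame({"Name": ["Walter White", "john wick"],
                    "Info 2": ["Male", "Male"]})

# Helper key: first whitespace-separated token, lower-cased (an assumption)
# so that "walter" and "Walter" would still agree.
left = df1.assign(First_Name=df1["Name"].str.split().str[0].str.lower())
right = df2.assign(First_Name=df2["Name"].str.split().str[0].str.lower())

# Both frames have a "Name" column, so explicit suffixes keep them apart.
merged = pd.merge(left, right, on="First_Name", how="left",
                  suffixes=("_short", "_full"))

# Prefer the full name from df2; fall back to df1's name where nothing matched.
merged["Name"] = merged["Name_full"].fillna(merged["Name_short"])
result = merged[["Name", "Info 1", "Info 2"]]
print(result)
```

This only works while first names are unique in df2. If two people share a first name, the left join produces one output row per match, so duplicates would need to be resolved separately.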