Get values from another dataframe if NaN after merge by incomplete name

Time:09-07

I have two DataFrames. After merging them on "Name", some rows return NaN because some of the names are incomplete.

df1

Name         Info 1
Walter       Adress 1
john wick    Adress 1

df2

Name            Info 2
Walter White    Male
john wick       Male

df2 = pd.merge(df1, df2, on='Name', how='left')

I'm getting

Name         Info 1      Info 2
Walter       NaN         NaN
john wick    Adress 1    Male

I want

Name            Info 1      Info 2
Walter White    Adress 1    Male
john wick       Adress 1    Male

How can I handle the rows that return NaN, so that their values are looked up by substring instead? I don't know whether using merge in the first place was the best approach.

CodePudding user response:

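Try pd.merge_asof, which joins on the nearest key rather than on an exact match:

df2 = pd.merge_asof(df1, df2, on='Name')

Note that merge_asof has no how argument (it always behaves like a left join) and expects sorted numeric or datetime-like keys, so it will raise an error on string names like these. Whether any approximate match works depends on how closely the values resemble each other.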
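Since the match depends on how similar the names are, one way to sketch that idea is with the standard library's difflib rather than merge_asof. This is only an illustration under assumptions: the helper closest_name and the 0.6 cutoff are hypothetical choices, not something from the question, and the frames are rebuilt from the sample data above.

```python
import difflib

import pandas as pd

# Sample frames rebuilt from the question.
df1 = pd.DataFrame({"Name": ["Walter", "john wick"],
                    "Info 1": ["Adress 1", "Adress 1"]})
df2 = pd.DataFrame({"Name": ["Walter White", "john wick"],
                    "Info 2": ["Male", "Male"]})


def closest_name(name, candidates, cutoff=0.6):
    """Hypothetical helper: return the most similar candidate,
    or the original name if nothing clears the cutoff."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else name


candidates = df2["Name"].tolist()

# Replace each (possibly incomplete) name in df1 with its closest full name
# from df2, then do an ordinary exact left merge on "Name".
fuzzy = (
    df1.assign(Name=df1["Name"].map(lambda n: closest_name(n, candidates)))
       .merge(df2, on="Name", how="left")
)
print(fuzzy)
```

The cutoff is a judgment call: too low and unrelated names get paired, too high and short names such as "Walter" no longer reach "Walter White". It is worth checking the matched pairs by eye before trusting the result.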

CodePudding user response:

The reason it's not working is that pandas doesn't consider "Walter" and "Walter White" to be the same value. A left join on df1 keeps every row of df1 and adds the values from df2 whose "Name" is exactly equal. Since "Walter" is not present in df2, the Info 2 column gets NaN for that row. One way to solve this is to split the name into separate columns such as "First_Name" and "Last_Name" and then merge on "First_Name", something like:

df1["First_Name"] = df1["Name"].str.split().str[0]
df2["First_Name"] = df2["Name"].str.split().str[0]

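Then use the same merge as before, but on "First_Name":

df2 = pd.merge(df1, df2, on='First_Name', how='left')

Both frames still carry a "Name" column, so the result contains Name_x (from df1) and Name_y (from df2).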
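Put together, the first-name approach might look like the sketch below. It rebuilds the sample frames from the question and assumes the first whitespace-separated token is enough to identify a person. The names left, right and merged are just illustrative.

```python
import pandas as pd

# Sample frames rebuilt from the question.
df1 = pd.DataFrame({"Name": ["Walter", "john wick"],
                    "Info 1": ["Adress 1", "Adress 1"]})
df2 = pd.DataFrame({"Name": ["Walter White", "john wick"],
                    "Info 2": ["Male", "Male"]})

# Derive the join key from the first word of each name.
left = df1.assign(First_Name=df1["Name"].str.split().str[0])
right = df2.assign(First_Name=df2["Name"].str.split().str[0])

# Drop df1's partial "Name" so the result keeps df2's full name
# and no Name_x / Name_y suffixes appear.
merged = (
    left.drop(columns="Name")
        .merge(right, on="First_Name", how="left")
        [["Name", "Info 1", "Info 2"]]
)
print(merged)
```

Keep in mind that this comparison is case-sensitive and that two people sharing a first name would each match, producing duplicate rows. A df1 row with no counterpart in df2 ends up with NaN in both "Name" and "Info 2".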
