Replace values in specific rows from one DataFrame to another when certain columns have the same values

Unlike the other questions on this topic, I don't want to create a new column with the new values. I want to keep the same column and only replace the old values with the new ones where they exist.

For a new column I would have:

import pandas as pd

df1 = pd.DataFrame(data = {'Name' : ['Carl','Steave','Julius','Marcus'], 
                           'Work' : ['Home','Street','Car','Airplane'],
                           'Year' : ['2022','2021','2020','2019'],
                           'Days' : ['',5,'','']})

df2 = pd.DataFrame(data = {'Name' : ['Carl','Julius'], 
                           'Work' : ['Home','Car'],
                           'Days' : [1,2]})

df_merge = pd.merge(df1, df2, how='left', on=['Name','Work'], suffixes=('','_'))
print(df_merge)

     Name      Work  Year Days  Days_
0    Carl      Home  2022         1.0
1  Steave    Street  2021    5    NaN
2  Julius       Car  2020         2.0
3  Marcus  Airplane  2019         NaN

But what I really want is exactly like this:

     Name      Work  Year Days
0    Carl      Home  2022    1
1  Steave    Street  2021    5
2  Julius       Car  2020    2
3  Marcus  Airplane  2019

How can I make such a union?

CodePudding user response：

You can use combine_first, setting the empty strings to NaN beforehand (the column selection at the end restores the original column order to match the desired output):

df1.loc[df1["Days"] == "", "Days"] = float("NaN")
df1.combine_first(df1[["Name", "Work"]].merge(df2, how="left"))[df1.columns.values]

This outputs:

     Name      Work  Year Days
0    Carl      Home  2022  1.0
1  Steave    Street  2021    5
2  Julius       Car  2020  2.0
3  Marcus  Airplane  2019  NaN
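
For reference, here is a self-contained sketch of the same combine_first idea that leaves df1 untouched. It assumes every (Name, Work) pair appears at most once in df2, so the left merge keeps df1's row order and index; the names left, lookup and result are only illustrative. It also swaps in pd.to_numeric(..., errors="coerce") to turn the empty strings into NaN, which makes Days a float column instead of a mixed one:

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Carl', 'Steave', 'Julius', 'Marcus'],
                    'Work': ['Home', 'Street', 'Car', 'Airplane'],
                    'Year': ['2022', '2021', '2020', '2019'],
                    'Days': ['', 5, '', '']})
df2 = pd.DataFrame({'Name': ['Carl', 'Julius'],
                    'Work': ['Home', 'Car'],
                    'Days': [1, 2]})

# Work on a copy so the original df1 is not modified.
left = df1.copy()

# Assumption: an empty string means "missing". Coerce Days to numbers,
# so '' becomes NaN and 5 becomes 5.0.
left['Days'] = pd.to_numeric(left['Days'], errors='coerce')

# One row per df1 row, carrying df2's Days where (Name, Work) matches.
# Assumes the key pairs are unique in df2, otherwise rows would multiply.
lookup = left[['Name', 'Work']].merge(df2, how='left', on=['Name', 'Work'])

# Fill the NaN cells of left from lookup, then restore df1's column order.
result = left.combine_first(lookup)[df1.columns]
print(result)
```

Rows that have no match in df2 and no value in df1 stay NaN here, not blank. If a blank is wanted for display, result['Days'].fillna('') should bring it back, at the cost of a mixed-type column.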

CodePudding user response：

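You can use the update method of Series. After the merge, the Days column coming from df2 is named Days_y (the default suffix), and update skips its NaN entries, so only the matching rows are overwritten: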

df1.Days.update(pd.merge(df1, df2, how='left', on=['Name','Work']).Days_y)
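
A slightly more defensive sketch of the same update idea: calling update directly on df1.Days depends on that attribute access handing back a view of the frame, which as far as I know is not the case when pandas' Copy-on-Write mode is enabled. Updating a copy of the column and assigning it back avoids relying on that. The names merged and days are only illustrative, and the (Name, Work) pairs are assumed to be unique in df2 so that the merged index lines up with df1:

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Carl', 'Steave', 'Julius', 'Marcus'],
                    'Work': ['Home', 'Street', 'Car', 'Airplane'],
                    'Year': ['2022', '2021', '2020', '2019'],
                    'Days': ['', 5, '', '']})
df2 = pd.DataFrame({'Name': ['Carl', 'Julius'],
                    'Work': ['Home', 'Car'],
                    'Days': [1, 2]})

# Both frames have a Days column, so the merge yields Days_x (from df1)
# and Days_y (from df2, NaN where there is no match).
merged = pd.merge(df1, df2, how='left', on=['Name', 'Work'])

# Update a standalone copy of the column; NaN values in Days_y are ignored,
# so unmatched rows keep whatever df1 already had (including '').
days = df1['Days'].copy()
days.update(merged['Days_y'])

# Assign the updated column back explicitly.
df1['Days'] = days
print(df1)
```

Unlike the combine_first version, this overwrites an existing value in df1 whenever df2 has a match, and it leaves the empty strings in place for rows without one.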