Replace values in specific rows from one DataFrame to another when certain columns have the same val

Time:01-15

Unlike the other questions, I don't want to create a new column with the new values. I want to keep the same column and replace the old values with the new ones where they exist.

For a new column I would have:

import pandas as pd

df1 = pd.DataFrame(data = {'Name' : ['Carl','Steave','Julius','Marcus'], 
                           'Work' : ['Home','Street','Car','Airplane'],
                           'Year' : ['2022','2021','2020','2019'],
                           'Days' : ['',5,'','']})

df2 = pd.DataFrame(data = {'Name' : ['Carl','Julius'], 
                           'Work' : ['Home','Car'],
                           'Days' : [1,2]})

df_merge = pd.merge(df1, df2, how='left', on=['Name','Work'], suffixes=('','_'))
print(df_merge)
     Name      Work  Year Days  Days_
0    Carl      Home  2022         1.0
1  Steave    Street  2021    5    NaN
2  Julius       Car  2020         2.0
3  Marcus  Airplane  2019         NaN

But what I really want is exactly like this:

     Name      Work  Year Days
0    Carl      Home  2022    1
1  Steave    Street  2021    5
2  Julius       Car  2020    2
3  Marcus  Airplane  2019     

How can I make such a union?

CodePudding user response:

You can use combine_first, setting the empty strings to NaNs beforehand (the indexing at the end is to rearrange the columns to match the desired output):

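# Mark empty strings as missing so combine_first can fill them
df1.loc[df1["Days"] == "", "Days"] = float("NaN")
# Look up Days from df2 for each (Name, Work) pair, then fill the gaps in df1
df1.combine_first(df1[["Name", "Work"]].merge(df2, how="left"))[df1.columns.values]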

This outputs:

     Name      Work  Year Days
0    Carl      Home  2022  1.0
1  Steave    Street  2021    5
2  Julius       Car  2020  2.0
3  Marcus  Airplane  2019  NaN
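
The output above shows 1.0 and 2.0 where the question asks for 1 and 2. A minimal sketch of one way to get whole numbers, assuming it is acceptable for Days to become pandas' nullable Int64 dtype (so the empty cell shows as <NA> rather than an empty string); the names days, lookup and result are only illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Carl", "Steave", "Julius", "Marcus"],
                    "Work": ["Home", "Street", "Car", "Airplane"],
                    "Year": ["2022", "2021", "2020", "2019"],
                    "Days": ["", 5, "", ""]})
df2 = pd.DataFrame({"Name": ["Carl", "Julius"],
                    "Work": ["Home", "Car"],
                    "Days": [1, 2]})

# Turn the mixed column into numbers; empty strings become NaN.
days = pd.to_numeric(df1["Days"], errors="coerce")

# Look up df2's Days for every (Name, Work) pair in df1 (NaN when no match).
lookup = df1[["Name", "Work"]].merge(df2, how="left", on=["Name", "Work"])

# Keep df1's value where present, otherwise take df2's, then store as nullable integers.
result = df1.assign(Days=days.combine_first(lookup["Days"]).astype("Int64"))
print(result)
```

Note that combine_first keeps a value already present in df1 and only fills the gaps, so if a row had a value in both frames, the one from df1 would win.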

CodePudding user response:

You can use the update method of Series:

df1.Days.update(pd.merge(df1, df2, how='left', on=['Name','Work']).Days_y)
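
The one-liner above relies on df1.Days giving back the column that lives inside df1. With Copy-on-Write enabled (optional in pandas 2.x and, as far as I know, the default from pandas 3.0), the update would act on a temporary Series and df1 may stay unchanged. A sketch of the same idea that updates an explicit copy and assigns it back; the names merged and days are only illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({"Name": ["Carl", "Steave", "Julius", "Marcus"],
                    "Work": ["Home", "Street", "Car", "Airplane"],
                    "Year": ["2022", "2021", "2020", "2019"],
                    "Days": ["", 5, "", ""]})
df2 = pd.DataFrame({"Name": ["Carl", "Julius"],
                    "Work": ["Home", "Car"],
                    "Days": [1, 2]})

# Both frames have a Days column, so the merge yields Days_x (df1) and Days_y (df2).
merged = pd.merge(df1, df2, how="left", on=["Name", "Work"])

# Series.update overwrites with the non-NaN values of the other Series, aligned on index.
days = df1["Days"].copy()
days.update(merged["Days_y"])

# Assign the updated column back explicitly instead of relying on in-place mutation.
df1["Days"] = days
print(df1)
```

Here the values from df2 win whenever they exist, and rows without a match in df2 keep whatever df1 already had (including the empty string).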