replace values to a dataframe column from another dataframe-CodePudding

I am trying to replace values from a dataframe column with values from another based on a third one and keep the rest of the values from the first df.

# df1
country   name   value
romania   john   100
russia    emma   200
sua       mark   300
china     jack   400

# df2
name   value
emma   2
mark   3

Desired result:

# df3
country   name   value
romania   john   100
russia    emma   2
sua       mark   3
china     jack   400

Thank you

CodePudding user response：

One approach could be as follows:

Use Series.map on column name and turn df2 into a Series for mapping by setting its index to name (df.set_index).
Next, chain Series.fillna to replace NaN values with original values from df.value (i.e. whenever mapping did not result in a match) and assign to df['value'].

df['value'] = df['name'].map(df2.set_index('name')['value']).fillna(df['value'])

print(df)

   country  name  value
0  romania  john  100.0
1   russia  emma    2.0
2      sua  mark    3.0
3    china  jack  400.0

N.B. The result will now contain floats. If you prefer integers, chain .astype(int) as well.

CodePudding user response：

Another option:

df3          = df1.merge(df2, on = 'name', how = 'left')
df3['value'] = df3.value_y.fillna(df3.value_x) 
df3.drop(['value_x', 'value_y'], axis = 1, inplace = True)

#    country  name  value
# 0  romania  john  100.0
# 1   russia  emma    2.0
# 2      sua  mark    3.0
# 3    china  jack  400.0

Reproducible data:

df1=pd.DataFrame({'country':['romania','russia','sua','china'],'name':['john','emma','mark','jack'],'value':[100,200,300,400]})
df2=pd.DataFrame({'name':['emma','mark'],'value':[2,3]})

CodePudding user response：

Another option could be using pandas.DataFrame.Update:

df1.set_index('name', inplace=True)
df1.update(df2.set_index('name'))
df1.reset_index(inplace=True)

   name  country  value
0  john  romania  100.0
1  emma   russia    2.0
2  mark      sua    3.0
3  jack    china  400.0