How to replace values of a column based on another data frame?

Time:11-19

I have a column containing symbols of chemical elements and other substances. Something like this:

Commodities
sn
sulfuric acid
cu
sodium chloride
au
import pandas as pd
df1 = pd.DataFrame(['sn', 'sulfuric acid', 'cu', 'sodium chloride', 'au'], columns=['Commodities'])

And I have another data frame containing the symbols of the chemical elements and their respective names. Like this:

Symbol Name
sn tin
cu copper
au gold
df2 = pd.DataFrame({'Name': ['tin', 'copper', 'gold'], 'Symbol': ['sn', 'cu', 'au']})

I need to replace the symbols in the first data frame (df1['Commodities']) with the names from the second one (df2['Name']), so that the output looks like this:

Commodities
tin
sulfuric acid
copper
sodium chloride
gold

I tried using for loops and lambdas but got different results than expected. I have tried many things and searched online. I think it's something basic, but I just can't find an answer.

Thank you in advance!

CodePudding user response:

Try:

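# Replace each symbol with its name, one row of df2 at a time.
# Note: str.replace matches substrings, not only whole cell values.
for _, row in df2.iterrows():
    df1['Commodities'] = df1['Commodities'].str.replace(row['Symbol'], row['Name'], regex=False)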

which gives df1 as:

       Commodities
0              tin
1    sulfuric acid
2           copper
3  sodium chloride
4             gold

EDIT: If your names and symbols start out as plain lists, it is likely more efficient to skip defining df2 at all and just zip the two lists together and iterate over that, since iterrows() builds a Series for every row.
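
A minimal sketch of the zip approach, assuming the symbols and names are available as two plain lists (the variable names `symbols` and `names` are illustrative, not from the question):

```python
import pandas as pd

df1 = pd.DataFrame(['sn', 'sulfuric acid', 'cu', 'sodium chloride', 'au'],
                   columns=['Commodities'])

# Hypothetical plain lists standing in for df2's columns
symbols = ['sn', 'cu', 'au']
names = ['tin', 'copper', 'gold']

# Pair each symbol with its name and replace it as a literal string
for symbol, name in zip(symbols, names):
    df1['Commodities'] = df1['Commodities'].str.replace(symbol, name, regex=False)

print(df1)
```

As in the loop above, this still replaces substrings, so it is only safe when no symbol appears inside another value.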

CodePudding user response:

First, convert df2 to a dictionary that maps each symbol to its name:

replace_dict = dict(df2[['Symbol', 'Name']].to_dict(orient='split')['data'])
# {'sn': 'tin', 'cu': 'copper', 'au': 'gold'}

Then use the replace function, which swaps every cell whose value is a key of the dictionary:

df1['Commodities'] = df1['Commodities'].replace(replace_dict)
print(df1)
'''
       Commodities
0              tin
1    sulfuric acid
2           copper
3  sodium chloride
4             gold
'''
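
One practical difference from the str.replace loop in the other answer: Series.replace with a dict only changes cells whose whole value equals a key, while str.replace also rewrites substrings. A small sketch of that difference, using a made-up value 'custard' that is not in the question's data:

```python
import pandas as pd

replace_dict = {'sn': 'tin', 'cu': 'copper', 'au': 'gold'}
s = pd.Series(['cu', 'custard'])  # 'custard' is a hypothetical value for illustration

# Whole-value matching: only the cell that is exactly 'cu' changes
whole = s.replace(replace_dict)

# Substring matching: the 'cu' inside 'custard' is rewritten too
partial = s.str.replace('cu', 'copper', regex=False)

print(whole.tolist())    # ['copper', 'custard']
print(partial.tolist())  # ['copper', 'copperstard']
```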