Home > Software engineering >  replacing a column value in panda dataframe
replacing a column value in panda dataframe

Time:09-30

I have a dataframe in panda which has values like this

    Id   ExternalID  class  Unit    Description
0   XYZ     67709           bara      old values
1   ABC     5121     0204   °C      0 values added

and i have csv file which has a mapping of Ids with new Ids which needs to be replaced in above dataframe

My csv file :

Id    NewId
XYZ   OSS_XYZ
ABC   POW_ABC_A

How can I have my dataframe updated so that I have Id column in dataframe replace with NewId columns in my csv file with correct mapping? for example after mapping my dataframe should be like this

    Id         ExternalID    class  Unit    Description
0   OSS_XYZ       67709             bara      old values
1   POW_ABC_A     5121        0204  °C      0 values added

CodePudding user response:

Try with replace

df = df.replace({'Id': dict(zip(df2.Id,df2.NewId))})

CodePudding user response:

first create dataframe from csv then create dictionary from df2 then update column Id with key in dictionary like below:

>>> df2 = pd.read_csv(csvfile)
>>> dct_id = dict(zip(df2.Id,df2.NewId))
{'XYZ': 'OSS_XYZ' , 'ABC':'POW_ABC_A'}

>>> df['Id'].update(pd.Series(dct_id))
    Id         ExternalID    class   Unit    Description
0   OSS_XYZ       67709              bara      old values
1   POW_ABC_A     5121        0204   °C      0 values added

CodePudding user response:

Assuming your mapping dataframe is df2 = pd.read_csv(csvfile), set df2's index to Id and use it to map df['Id']:

df['Id'] = df['Id'].map(df2.set_index('Id')['NewId'])

#           Id  ExternalID  class  Unit     Description
# 0    OSS_XYZ       67709    NaN  bara      old values
# 1  POW_ABC_A        5121  204.0    °C  0 values added

CodePudding user response:

Use the replace method.

df2 = pd.read_csv('mydata.csv') # read the csv

old = df2['Id'].tolist() # the old IDs you want to replace
new = df2['NewId'].tolist() # the new IDs to replace them with

df['Id'].replace(to_replace=old, value=new, inplace=True) # replace

print(df) # view them
  • Related