Home > Software design >  pandas replace dataframe float values using int keys of nested dict
pandas replace dataframe float values using int keys of nested dict

Time:05-24

I have a dataframe and dictionary like as shown below

ID,Name,value,total,     
1,Ajay,2.00,35          
1,Dan,3.00,65
2,Ajay,2,78
2,Rajini,0.0,98           
3,Ajay,3.00,53    
3,Rad,75.25,21

df1 = pd.read_clipboard(sep=',')

output = {'Ajay': {1: 'ABC', 2: 'DEF', 3: 'DUMMA', 4: 'CHUMMA'}, 'Dan': {0: 'KOREA', 1: 'AUS/NZ', 2: 'INDIA', 3: 'ASEAN'}}

I would like to do the below

a) Replace the values in value column by matching Name value to Name key in nested dict.

For ex: ID=1 has Name as Ajay and value as 2.00.

Now, if we look at the dict, we Ajay outer key and trying to find the matching key (which is 2). So, we replace value 2.00 with DEF.

Similarly, we do this for other Name which is Dan.

I tried the below

df1.replace({"values": output},inplace=True)  # doesn't work
for d in output.values():
    print(d.key())  

Is there any efficient and elegant way to do this sort of replacement for million rows dataframe?

I expect my output to be like as below

enter image description here

CodePudding user response:

Try this

# map value-Names to output and fill the missing values with original values
df1['value'] = pd.Series(df1.set_index(['value','Name']).index.map(pd.DataFrame(output).stack())).fillna(df1['value'])
print(df1)
   ID    Name  value  total
0   1    Ajay    DEF     35
1   1     Dan  ASEAN     65
2   2    Ajay    DEF     78
3   2  Rajini    0.0     98
4   3    Ajay  DUMMA     53
5   3     Rad  75.25     21

CodePudding user response:

Use DataFrame.join with DataFrame.stack for new column and then use Series.fillna for replace not matched values by value column:

df = df.join(pd.DataFrame(output).stack().rename('new'), on=['value','Name'])
df['value'] = df.pop('new').fillna(df['value'])
print (df)
   ID    Name  value  total
0   1    Ajay    DEF     35
1   1     Dan  ASEAN     65
2   2    Ajay    DEF     78
3   2  Rajini    0.0     98
4   3    Ajay  DUMMA     53
5   3     Rad  75.25     21

CodePudding user response:

You can try loop the dictionary then use df.mask or np.where or loc with boolean indexing to replace value.

for k, v in output.items():
    df['value'] = df['value'].mask(df['Name'].eq(k), df['value'].map(v))
    #df['value'] = np.where(df['Name'].eq(k), df['value'].map(v), df['value'])
    #df.loc[df['Name'].eq(k), 'value'] = df.loc[df['Name'].eq(k), 'value'].map(v)
print(df)

   ID    Name  value  total
0   1    Ajay    DEF     35
1   1     Dan  ASEAN     65
2   2    Ajay    DEF     78
3   2  Rajini    0.0     98
4   3    Ajay  DUMMA     53
5   3     Rad  75.25     21
  • Related