pandas: replace with a dictionary does not work with string of sentences

Time:12-08

I have a dataframe as follows:

import pandas as pd
df = pd.DataFrame({'text':['Lary Page is visiting today',' His boss, Maria Jackson is here.']})

I have extracted the names into the list below, used the Faker library to create as many fake names as there are entries in person_name, and built a dictionary from the two lists.

from faker import Faker
fake = Faker()

person_name = ['Lary Page', 'Maria Jackson']
fake_name = [fake.name() for _ in person_name]
name_dict = dict(zip(person_name, fake_name))

Now I would like to replace the names in the dataframe using the dictionary, but the following call raises an error:

df.text.str.replace(name_dict)

My desired output (e.g.):

print(df)

Angela Mindeston is visiting today
His boss, Emanuel Smith is here.

CodePudding user response:

Build a single regex from the dictionary keys and use Series.str.replace with a lambda callback, or pass the dictionary directly to Series.replace with regex=True:

regex = '|'.join(r"\b{}\b".format(x) for x in name_dict.keys())
df['text1'] = df.text.str.replace(regex, lambda x: name_dict[x.group()], regex=True)

df['text2'] = df.text.replace(name_dict, regex=True)
print(df)
                                text                                 text1  \
0        Lary Page is visiting today            Gary Cox is visiting today   
1   His boss, Maria Jackson is here.   His boss, Mr. George Jones is here.   

                                  text2  
0            Gary Cox is visiting today  
1   His boss, Mr. George Jones is here.  
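
If the names may contain regex metacharacters (for example the "." in "Mr."), or one name may be contained in another, a slightly more defensive variant of the callback approach could look like the sketch below. It assumes fixed replacement names instead of Faker output so the result is reproducible; the column name text_fake and the replacement names are made up for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({'text': ['Lary Page is visiting today',
                            ' His boss, Maria Jackson is here.']})

# Fixed stand-ins for the Faker output (assumed values, for illustration only).
name_dict = {'Lary Page': 'Angela Mindeston', 'Maria Jackson': 'Emanuel Smith'}

# Longest keys first, so a longer name is tried before a shorter one it contains.
keys = sorted(name_dict, key=len, reverse=True)

# re.escape keeps characters such as "." from being read as regex syntax;
# \b restricts matches to whole words.
pattern = '|'.join(r'\b{}\b'.format(re.escape(k)) for k in keys)

# The callback receives each match object and looks up its replacement.
df['text_fake'] = df['text'].str.replace(
    pattern, lambda m: name_dict[m.group()], regex=True)

print(df['text_fake'].tolist())
```

The lookup name_dict[m.group()] works here because every alternative in the pattern matches one dictionary key literally, so the matched text is always a key.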