How to change string based on list in pandas

Time:10-27

I have a mapper as follows

MAPPER = {
    'g': ['gm', 'gram', 'grams', 'gms'],
    'ml': ['mls', 'milli-litre', 'mili-litre', 'milli litre', 'mili litre'],
    'kg': ['kilo', 'kilo-gram', 'kilo gram', 'kilo grams'] 
}

and a pandas series as follows

Salt 500 gm
Sugar Powder 500 gm
Sugar 500 gm
Flour 500 gm
Repellent 10 mls

I want to change the gm and mls to the key from the mapper such that the result is as follows

Salt 500 g
Sugar Powder 500 g
Sugar 500 g
Flour 500 g
Repellent 10 ml

How do I go about doing this?

CodePudding user response:

First flatten the dictionary of lists into a single dictionary that maps each variant, wrapped in word boundaries (\b), to its key, then pass that dictionary to Series.replace with regex=True:

s = s.replace({rf'\b{x}\b': k  for k, v in MAPPER.items() for x in v}, regex=True)
print(s)
0            Salt 500 g
1    Sugar Powder 500 g
2           Sugar 500 g
3           Flour 500 g
4       Repellent 10 ml
Name: a, dtype: object
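
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The series is rebuilt from the question's sample data, the name "a" is taken from the output above, and the variable names `patterns` and `result` are my own. The `re.escape` call is an addition that is not in the answer; it is a precaution in case a variant ever contains a regex metacharacter.

```python
import re

import pandas as pd

MAPPER = {
    'g': ['gm', 'gram', 'grams', 'gms'],
    'ml': ['mls', 'milli-litre', 'mili-litre', 'milli litre', 'mili litre'],
    'kg': ['kilo', 'kilo-gram', 'kilo gram', 'kilo grams'],
}

# Sample data copied from the question; the name "a" is an assumption.
s = pd.Series(
    ['Salt 500 gm', 'Sugar Powder 500 gm', 'Sugar 500 gm',
     'Flour 500 gm', 'Repellent 10 mls'],
    name='a',
)

# One regex per variant, each mapped to its canonical unit.
# re.escape is an extra safeguard, not part of the original answer.
patterns = {
    rf'\b{re.escape(x)}\b': k
    for k, v in MAPPER.items()
    for x in v
}

result = s.replace(patterns, regex=True)
print(result)
```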

If the unit should only be replaced when it is the last part of the string, add $ so the pattern matches only at the end of the string:

s = s.replace({rf'\b{x}\b$': k  for k, v in MAPPER.items() for x in v}, regex=True)
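
To illustrate the difference the $ anchor makes, here is a small sketch. The input string "gram flour 500 gm" is a made-up example, not from the question, and the names `anywhere` and `at_end` are my own.

```python
import pandas as pd

MAPPER = {
    'g': ['gm', 'gram', 'grams', 'gms'],
    'ml': ['mls', 'milli-litre', 'mili-litre', 'milli litre', 'mili litre'],
    'kg': ['kilo', 'kilo-gram', 'kilo gram', 'kilo grams'],
}

# Hypothetical input where a unit word also appears in the product name.
s = pd.Series(['gram flour 500 gm'])

# Without the anchor, every whole-word occurrence is replaced.
anywhere = s.replace(
    {rf'\b{x}\b': k for k, v in MAPPER.items() for x in v}, regex=True
)

# With the anchor, only a unit at the very end of the string is replaced.
at_end = s.replace(
    {rf'\b{x}\b$': k for k, v in MAPPER.items() for x in v}, regex=True
)

print(anywhere[0])
print(at_end[0])
```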

CodePudding user response:

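One approach is to invert the mapper into a lookup from each variant to its key, build a single alternation pattern from the variants, and let Series.str.replace call a function for every match: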

MAPPER = {
    'g': ['gm', 'gram', 'grams', 'gms'],
    'ml': ['mls', 'milli-litre', 'mili-litre', 'milli litre', 'mili litre'],
    'kg': ['kilo', 'kilo-gram', 'kilo gram', 'kilo grams']
}

lookup = { v : k for k, vs in MAPPER.items() for v in vs }
res = ser.str.replace(rf"\b({'|'.join(lookup)})\b", lambda x: lookup[x.group()], regex=True)
print(res)

Output

0            Salt 500 g
1    Sugar Powder 500 g
2           Sugar 500 g
3           Flour 500 g
4       Repellent 10 ml
Name: 0, dtype: object
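
One thing to watch for with a single alternation: Python's regex engine tries the alternatives from left to right, so a short variant such as 'kilo' can match before a longer one such as 'kilo gram'. A possible variation, sketched below, sorts the variants longest first and escapes them. The sample rows and the names `alternatives` and `pattern` are my own assumptions, not part of the question or the answer.

```python
import re

import pandas as pd

MAPPER = {
    'g': ['gm', 'gram', 'grams', 'gms'],
    'ml': ['mls', 'milli-litre', 'mili-litre', 'milli litre', 'mili litre'],
    'kg': ['kilo', 'kilo-gram', 'kilo gram', 'kilo grams'],
}

# Hypothetical rows that use multi-word variants.
ser = pd.Series(['Rice 2 kilo gram', 'Milk 250 milli litre'])

# Invert the mapper: variant -> canonical unit.
lookup = {v: k for k, vs in MAPPER.items() for v in vs}

# Longest variants first, so 'kilo gram' is tried before 'kilo'.
alternatives = sorted(lookup, key=len, reverse=True)
joined = '|'.join(re.escape(a) for a in alternatives)
pattern = rf'\b(?:{joined})\b'

# The callable receives a match object and returns the replacement text.
res = ser.str.replace(pattern, lambda m: lookup[m.group()], regex=True)
print(res)
```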