Pandas replace using dictionary and regular expression

Time:09-30

I want to replace every match of the following regular expression:

(?<!\d)(\d{2}\-\d{2})(?!\d)

with a specific value.

For example, given:

column = pd.Series(['01-01', '01-01 qwerasdf 0101-0101'])

I want to replace all '01-01' with '0101'

(but only where there is no digit immediately before or after '01-01', so '0101-0101' remains unchanged).

I can use the following to get what I want.

column = column.str.replace(r'(?<!\d)(\d{2}\-\d{2})(?!\d)', '0101', regex=True)
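To see what the lookbehind (?<!\d) and lookahead (?!\d) contribute, here is a minimal sketch that applies the same pattern to a single string with Python's standard re module. The variable names are illustrative only and are not part of the original question.

```python
import re

# Same pattern as above: two digits, a hyphen, two digits,
# with no digit immediately before or after the match.
pattern = re.compile(r'(?<!\d)(\d{2}-\d{2})(?!\d)')

text = '01-01 qwerasdf 0101-0101'
replaced = pattern.sub('0101', text)

# The standalone '01-01' is replaced; the '01-01' inside '0101-0101'
# is skipped because it has digits on both sides.
print(replaced)  # 0101 qwerasdf 0101-0101
```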

But now I have a dictionary of replacements:

{'01-01': '0101', '01-02': '0102'...}

How can I use a regular expression and a dictionary at the same time in the replace function?

CodePudding user response:

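Pass a callback as the replacement and look each match up in the dictionary. With dict.get, a match that has no entry in the dictionary (such as 05-07 below) is returned unchanged:

import pandas as pd

column = pd.Series(['01-01', '01-01 qwerasdf 0101-0101', '01-02 aa', '05-07 dd'])

d = {'01-01': '0101', '01-02': '0102'}
# x is the match object; fall back to the matched text when the key is missing
column = column.str.replace(r'(?<!\d)(\d{2}\-\d{2})(?!\d)',
                            lambda x: d.get(x.group(), x.group()),
                            regex=True)


print(column)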
0                       0101
1    0101 qwerasdf 0101-0101
2                    0102 aa
3                   05-07 dd
dtype: object
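As a possible variation, not part of the answer above, the pattern can be built from the dictionary keys themselves, so the regex only ever matches strings that have a replacement and the callback can index the dictionary directly. This is a sketch under the assumption that the keys are plain literal strings; the names pattern and result are illustrative.

```python
import re
import pandas as pd

d = {'01-01': '0101', '01-02': '0102'}
column = pd.Series(['01-01', '01-01 qwerasdf 0101-0101', '01-02 aa', '05-07 dd'])

# Alternation of the escaped keys, wrapped in the same digit lookarounds.
pattern = r'(?<!\d)(?:' + '|'.join(re.escape(k) for k in d) + r')(?!\d)'

# Every match is guaranteed to be a key, so d[...] cannot raise KeyError here.
result = column.str.replace(pattern, lambda m: d[m.group()], regex=True)

print(result.tolist())
# ['0101', '0101 qwerasdf 0101-0101', '0102 aa', '05-07 dd']
```

The trade-off is that the pattern grows with the dictionary, whereas the dict.get approach keeps one fixed pattern and filters in the callback.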