Replace a regex pattern in a string with another regex pattern in Python

Time:10-02

Is there a way to replace a regex pattern in a string with another regex pattern? I tried this but it didn't work as intended:

s = 'This is a test. There are two tests'
re.sub(r'\btest(s)??\b', "<b><font color='blue'>\btest(s)??\b</font></b>", s)

The output was:

"This is a <b><font color='blue'>\x08test(s)??\x08</font></b>. There are two <b><font color='blue'>\x08test(s)??\x08</font></b>"

Instead of the desired result, which wraps the keywords test and tests in HTML tags:

"This is a <b><font color='blue'>test</font></b>. There are two <b><font color='blue'>tests</font></b>"

And if there was a workaround, how could I apply that to a text column in a dataframe?

Thanks in advance.

CodePudding user response:

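You can pass a function as the replacement argument of re.sub. It is called with each match object and returns the string to substitute.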

import re


def replacer(match):
    if match[0] == 'test':
        return "<b><font color='blue'>test</font></b>"
    if match[0] == 'tests':
        return "<b><font color='blue'>tests</font></b>"


s = 'This is a test. There are two tests'
ss = re.sub(r'\btest(s)??\b', replacer, s)
print(ss)

Result:

This is a <b><font color='blue'>test</font></b>. There are two <b><font color='blue'>tests</font></b>
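
The replacer above hard-codes each keyword. As a minimal sketch of a more general variant (the name highlight is hypothetical, not part of the answer), the callback can build the replacement from match.group(0), which holds the full matched text:

```python
import re


def highlight(match):
    # match.group(0) is the entire matched text, e.g. "test" or "tests"
    return f"<b><font color='blue'>{match.group(0)}</font></b>"


s = 'This is a test. There are two tests'
result = re.sub(r'\btest(s)??\b', highlight, s)
print(result)
```

This way the function does not need a separate branch for every keyword the pattern can match.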

CodePudding user response:

If you want the replacement to contain the text that was matched, wrap the pattern in () to capture it, then use \1 in the replacement to insert the captured text.

re.sub(r'(\btest(s)??\b)', r"<b><font color='blue'>\1</font></b>", s)

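BTW: the replacement string also needs the r prefix so that Python leaves the \ alone and passes it to re.sub unchanged. Without it, \b becomes a backspace character (the \x08 in your output) and \1 becomes the character \x01.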

Result:

"This is a <b><font color='blue'>test</font></b>. There are two <b><font color='blue'>tests</font></b>"
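
To illustrate why the r prefix matters, here is a small sketch comparing a plain replacement string, a raw one, and one with a doubled backslash. The shortened <b> tag and the variable names are only for illustration:

```python
import re

s = 'This is a test. There are two tests'
pattern = r'(\btest(s)??\b)'

# Without the r prefix, Python converts "\1" into the control character
# chr(1) before re.sub sees it, so no back-reference is substituted.
plain = re.sub(pattern, "<b>\1</b>", s)

# With the r prefix, the backslash reaches re.sub, which expands \1
# to the text captured by group 1.
raw = re.sub(pattern, r"<b>\1</b>", s)

# Doubling the backslash in a normal string has the same effect.
escaped = re.sub(pattern, "<b>\\1</b>", s)

print(repr(plain))
print(repr(raw))
print(repr(escaped))
```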

If you use more than one (), each pair captures its own group, and the groups are numbered \1, \2, etc. in the order of their opening parentheses.

For example:

re.sub(r'(.*) (.*)', r'\2 \1', 'first second')

gives:

'second first'

In the test example above, the inner (s) is also captured, and it has number \2.
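
As a quick check of the group numbering, the sketch below inspects both groups of the pattern on a sample string. It also shows \g<0>, which the re module accepts in a replacement string to mean the entire match, so the outer () is optional. The variable names are only for illustration:

```python
import re

s = 'This is a test. There are two tests'

# Group 1 is the outer (), group 2 is the inner (s).
m = re.search(r'(\btest(s)??\b)', 'two tests')
print(m.group(1))
print(m.group(2))

# \g<0> stands for the whole match, so no outer group is needed here.
out = re.sub(r'\btest(s)??\b', r"<b><font color='blue'>\g<0></font></b>", s)
print(out)
```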
