Python regex backreference to swap characters in a string

Time:07-04

I am trying to swap h and w in "hello world" to get "wello horld" using backreferences. I am able to capture the groups, but something goes wrong when I refer to them in the sub() method.

import re
st = "hello world"
t = re.compile(r"(\w).*\s(\w).*")
res = t.sub(r"\2 \1",st)
print(res)

I get "w h" as the output instead of the desired string. What am I missing?

CodePudding user response:

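Your pattern keeps only the two captured letters: anything it matches outside the groups is dropped by the substitution. You could capture two groups for each word, the first character and the rest of the word. Actually, for the second word it is enough to capture just the first letter, because the rest of that word lies outside the match and is left untouched.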

import re
st = "hello world"
output = re.sub(r'(\w)(\w*) (\w)', r'\3\2 \1', st)
print(output)  # wello horld
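
To see why the pattern in the question leaves only "w h", it can help to inspect how much of the string a single match consumes. The sketch below assumes the question's pattern was meant to be (\w).*\s(\w).* (an assumption, since that is the form consistent with the reported output), and the helper name swap_first_letters is hypothetical, not part of the re module.

```python
import re

st = "hello world"

# Assumed form of the question's pattern: the .* parts are matched but not
# captured, so sub() replaces the whole string with just "\2 \1".
greedy = re.compile(r"(\w).*\s(\w).*")
m = greedy.match(st)
whole_match = m.group(0)   # text consumed by the match
groups = m.groups()        # only the two first letters are captured


def swap_first_letters(text):
    """Swap the first letters of the first two words (hypothetical helper)."""
    # Group 2 keeps the rest of the first word plus the separator,
    # so it can be written back between the swapped letters.
    return re.sub(r"(\w)(\w*\s+)(\w)", r"\3\2\1", text, count=1)


result = swap_first_letters(st)
print(repr(whole_match), groups, repr(result))
```

Passing count=1 limits the swap to the first pair of words. Without it, later pairs in a longer string would be swapped as well.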

CodePudding user response:

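Your pattern captures only the two first letters. The text between them is matched but not captured, so it is lost in the replacement. Capturing that text as a third group will work for you:

import re
st = "hello world"
t = re.compile(r"(\w)(\w*\s)(\w)")
res = t.sub(r"\3\2\1", st)
print(res)  # wello horld

Explanation: the first (\w) captures the first letter of the first word, (\w*\s) captures the rest of that word together with the whitespace, and the last (\w) captures the first letter of the second word. The replacement \3\2\1 then writes the three groups back with the two letters swapped. Lookaround assertions such as the positive look-behind (?<=\w) and the positive look-ahead (?=\w) are not suitable here: they only check that a word character precedes or follows the current position, without consuming it or creating a group, so there would be nothing for \1 or \2 to refer to. Also note that \. matches only a literal dot, whereas an unescaped . is the metacharacter that matches any character except a newline. Hope it helps!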
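
A quick way to check how lookarounds, the dot, and backreferences behave is to try them on small inputs. The following is a minimal sketch using only the standard re module. The test strings "a.b" and "a.b c" are chosen purely for illustration.

```python
import re

# Lookarounds assert context but consume nothing and create no groups.
lookaround = re.compile(r"(?<=\w)\.(?=\w)")
group_count = lookaround.groups             # number of capture groups
no_dot = lookaround.search("hello world")   # no literal dot in this string
dotted = lookaround.search("a.b")           # matches only the "." itself

# An escaped dot is literal; an unescaped dot matches any character
# except a newline.
literal_hits = re.findall(r"\.", "a.b c")
any_hits = re.findall(r".", "a.b")

# A replacement that refers to groups the pattern does not define is rejected.
try:
    lookaround.sub(r"\2 \1", "a.b")
    error = None
except re.error as exc:
    error = exc

print(group_count, no_dot, dotted.group(0), literal_hits, any_hits, error)
```

Because the lookaround pattern defines no groups, a replacement such as \2 \1 cannot be used with it. Swapping characters needs real capture groups.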
