I am trying to swap the h and w in "hello world" to get "wello horld" using backreferences. I am able to capture the groups, but something goes wrong when I refer to them in the sub() method.
import re
st = "hello world"
t = re.compile(r"(\w).*\s(\w).*")
res = t.sub(r"\2 \1",st)
print(res)
I get "w h" as output instead of the desired string. What am I missing?
CodePudding user response:
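The problem is that your pattern matches the whole string but captures only the two first letters, so the replacement discards everything else. You need to capture the first character and the rest of the first word as separate groups. For the second word, capturing just the first letter is enough, because the unmatched remainder of the string is left in place.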
import re
st = "hello world"
output = re.sub(r'(\w)(\w*) (\w)', r'\3\2 \1', st)
print(output) # wello horld
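To see where the text goes missing, it can help to compare what the match consumed with what the groups captured. This is only a sketch, and it assumes the pattern in the question was meant to be `(\w).*\s(\w).*`:

```python
import re

st = "hello world"

# Assumed form of the question's pattern: .* consumes text without capturing it
m = re.match(r"(\w).*\s(\w).*", st)
print(m.group(0))   # hello world  (the whole string is matched)
print(m.groups())   # ('h', 'w')   (only the two letters are captured)

# sub() replaces the entire match, so only the captured letters survive
broken = re.sub(r"(\w).*\s(\w).*", r"\2 \1", st)
print(broken)       # w h

# Capturing the rest of the first word makes it available to the replacement
fixed = re.sub(r"(\w)(\w*) (\w)", r"\3\2 \1", st)
print(fixed)        # wello horld
```

Anything the pattern matches but does not capture is lost on substitution, while anything it does not match at all (here the trailing "orld") is left untouched.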
CodePudding user response:
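You are not capturing the rest of the first word, so the replacement has nothing to put back. The following will work for you:
import re
st = "hello world"
# Groups: 1 = first letter, 2 = rest of first word plus whitespace, 3 = first letter of second word
t = re.compile(r"(\w)(\w*\s+)(\w)(?=\w)")
res = t.sub(r"\3\2\1", st)
print(res)  # wello horld
Explanation: (\w) captures the first letter of the first word, (\w*\s+) captures the rest of that word together with the whitespace, and the second (\w) captures the first letter of the second word. (?=\w) is a positive lookahead assertion, which makes sure there is a word character after the current position without consuming it; similarly, (?<=\w) would be a positive lookbehind assertion requiring a word character immediately before the current position. Lookarounds are zero-width and create no groups, so they cannot be referenced as \1 or \2 in the replacement. Also note that an unescaped . is a regex metacharacter that matches any character except a newline, while \. matches only a literal dot. Hope it helps!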
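The behaviour of lookarounds and of the escaped dot can be checked directly. This is a minimal sketch; the sample strings "a.b c. .d" and "a.b" are made up for illustration:

```python
import re

# Lookarounds assert context at a position; they consume nothing and define no groups
dot_between = re.compile(r"(?<=\w)\.(?=\w)")
print(dot_between.groups)                  # 0

# \. matches only a literal dot, and here only one with word characters on both sides
found = dot_between.findall("a.b c. .d")
print(found)                               # ['.']

# A replacement that references a group the pattern does not define is rejected
try:
    dot_between.sub(r"\2 \1", "a.b")
    error_raised = False
except re.error:
    error_raised = True
print(error_raised)                        # True
```

So a pattern built only from lookarounds cannot feed \1 or \2 in the replacement. The text to be moved has to sit inside real capturing parentheses.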