Python3 re.sub only replaces once; how to replace all? [duplicate]

Time:10-09

I am using Python 3 regular expressions to process text that contains <span> tags. The goal is to remove all <span> tags. According to the re documentation, re.sub accepts a count argument, and the default count=0 means replace all occurrences.

The sample code is here:

import re
text = "\n<span><div>\n<span>Test string</span>\n</div></span>\n"
patten = re.compile('(.*)(<span .*?>|<span>)(.*?)</span>(.*)',re.IGNORECASE|re.MULTILINE|re.DOTALL)
text1=patten.sub(r'\1\n\3\n\4', text)
print("before:" + text + "\n" + "after:" + text1)

The output is here:

before:
<span><div>
<span>Test string</span>
</div></span>

after:
<span><div>
 Test string
</div></span>

The input string has two <span> tags, and I expect the output to contain none. The code removes only one of them and leaves the other in place. What is wrong with my code? Thanks very much.

Qian

CodePudding user response:

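Hope this helps. Your pattern starts and ends with (.*), and with re.DOTALL those groups swallow everything around the last <span>, so a single match covers the whole string and re.sub has nothing left to replace a second time. Match only the tags themselves instead: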

import re
text = "\n<span><div>\n<span>Test string</span>\n</div></span>\n"
# Match any opening or closing span tag, with or without attributes
patten = re.compile(r'</?span[^>]*>', re.IGNORECASE)
# Use the compiled pattern so the IGNORECASE flag is actually applied
text1 = patten.sub('', text)
print("before:" + text + "\n" + "after:" + text1)
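
To see the difference in numbers, here is a small sketch using re.subn, which returns the new string together with the number of replacements made. The variable names (original, tag_only, cleaned) are only illustrative, and the \b in the second pattern is an extra assumption on my part so that tag names that merely start with "span" are left alone.

```python
import re

text = "\n<span><div>\n<span>Test string</span>\n</div></span>\n"

# Pattern from the question: the leading and trailing (.*) groups,
# combined with re.DOTALL, let one match cover the entire string.
original = re.compile(r'(.*)(<span .*?>|<span>)(.*?)</span>(.*)',
                      re.IGNORECASE | re.MULTILINE | re.DOTALL)
_, original_count = original.subn(r'\1\n\3\n\4', text)

# Tag-only pattern: each opening or closing span tag is its own match.
# The \b is an addition for illustration, not part of the answer's pattern.
tag_only = re.compile(r'</?span\b[^>]*>', re.IGNORECASE)
cleaned, tag_count = tag_only.subn('', text)

print(original_count)  # 1
print(tag_count)       # 4
print(repr(cleaned))   # '\n<div>\nTest string\n</div>\n'
```

So count=0 does mean "replace every match"; the original pattern simply produces one match. For real-world HTML a parser is usually safer than a regex, but for simple tag stripping like this the tag-only pattern should be enough.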