I have text like this and want to remove whitespace only after the < and / characters, to avoid errors while parsing it.
Input:
< lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" / >
Output:
<lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" />
Note:
I don't want to remove the whitespace before > every time, only the whitespace right after /, because this is valid:
</lesson >
but this is not
</ lesson>
Regex I tried, which doesn't cover all cases: https://regex101.com/r/0LuV0O/1
Thanks in advance
CodePudding user response:
Think of this problem as removing the whitespace after < and /.
'< lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" / >'
.replace(/([</])\s*/g, '$1')
Output:
<lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" />
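To check the replacement against the other cases from the question, here is a minimal sketch that wraps the same regex in a function. The function name and the sample list are assumptions made for illustration; only the regex itself comes from the answer.

```javascript
// Hypothetical helper name; the regex is the one used in the answer above.
function stripSpacesAfterTagOpeners(text) {
  // ([</]) captures a "<" or "/"; \s* matches any whitespace that follows it.
  // Replacing with $1 keeps the captured character and drops the whitespace.
  return text.replace(/([</])\s*/g, '$1');
}

// Sample inputs taken from the question.
const samples = [
  '< lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" / >',
  '</ lesson>',   // invalid: space after "/"
  '</lesson >',   // valid: space before ">" must be kept
];

for (const sample of samples) {
  console.log(stripSpacesAfterTagOpeners(sample));
}
```

The space before > in </lesson > is left alone because the pattern only ever matches whitespace that directly follows < or /.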
CodePudding user response:
Here is a Python regular expression that removes whitespace after < and / in an HTML tag:
import re
cleaned = re.sub(r'(?<=[<\/])\s+', '', html_string)
This regular expression uses a positive lookbehind assertion to match one or more whitespace characters (\s+) that appear immediately after a < or / character in the HTML string. The matched whitespace is then replaced with an empty string by the re.sub() function.
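The same lookbehind idea can be sketched in JavaScript, the language used in the other answer. This assumes a runtime that supports lookbehind assertions (ES2018 or later); the function name is hypothetical.

```javascript
// Hypothetical helper; assumes an engine with lookbehind support (ES2018+).
function stripWithLookbehind(text) {
  // (?<=[<\/]) asserts that the previous character is "<" or "/" without
  // consuming it, so only the whitespace run (\s+) is matched and removed.
  return text.replace(/(?<=[<\/])\s+/g, '');
}

console.log(stripWithLookbehind('< lesson id="024AC57B0CA72ADE" / >'));
console.log(stripWithLookbehind('</ lesson>'));
console.log(stripWithLookbehind('</lesson >'));
```

Like the capture-group version, this pattern also removes whitespace after any / inside attribute values (for example a path containing "/ "), so it fits input where that cannot occur.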