Regex to remove white spaces after `<` and `/` in HTML tag

Time:12-15

I have text like this and want to remove whitespace only after the < and / characters, to avoid errors while parsing it. Input:

< lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" / >

Output:

<lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" />

Note: I don't want to remove the whitespace before > in every case, only when it comes right after /. For example, this is valid:

</lesson >

but this is not

</ lesson>

The regex I tried, which doesn't cover all cases: https://regex101.com/r/0LuV0O/1

Thanks in advance

CodePudding user response:

Think of this problem as removing the spaces after < and /.

'< lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" / >'
  .replace(/([</])\s*/g, '$1')

Output:

<lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" />
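For comparison, here is a minimal Python sketch of the same capture-group idea, checked against all three cases from the question. The function name `strip_after_lt_or_slash` is made up for illustration and is not part of the original answer.

```python
import re

def strip_after_lt_or_slash(text: str) -> str:
    # Capture "<" or "/" and drop any whitespace that follows it,
    # keeping only the captured character.
    return re.sub(r'([</])\s*', r'\1', text)

# The self-closing tag from the question is tightened on both ends.
print(strip_after_lt_or_slash(
    '< lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" / >'))
# -> <lesson id="024AC57B0CA72ADE" classids="5B111F8CD42D0943" />

# The invalid closing tag is repaired.
print(strip_after_lt_or_slash('</ lesson>'))   # -> </lesson>

# The valid closing tag is left alone (whitespace before ">" is kept).
print(strip_after_lt_or_slash('</lesson >'))   # -> </lesson >
```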

CodePudding user response:

Here is a regular expression that you can use to remove white spaces after < and / in an HTML tag:

re.sub(r'(?<=[<\/])\s+', '', html_string)

This regular expression uses a positive lookbehind assertion to match one or more whitespace characters (\s+) that appear immediately after a < or / character in the HTML string. The matched whitespace is then replaced with an empty string by the re.sub() function (this is Python, so the re module must be imported first).
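One caveat: the lookbehind pattern fires after every < or /, including a / inside an attribute value. The sketch below shows that effect and a somewhat stricter two-step variant that only touches tag delimiters. Both function names and the `href` sample are assumptions made for illustration, and neither function is a real HTML parser (for example, a bare < in running text would still be affected).

```python
import re

def strip_with_lookbehind(html_string: str) -> str:
    # The pattern from the answer above: whitespace right after "<" or "/".
    return re.sub(r'(?<=[</])\s+', '', html_string)

def tighten_tag_delimiters(html_string: str) -> str:
    # Step 1: "<", an optional "/", and any whitespace around it
    #         collapse to "<" or "</".
    s = re.sub(r'<\s*(/?)\s*', r'<\1', html_string)
    # Step 2: "/" followed by whitespace and ">" collapses to "/>".
    return re.sub(r'/\s+>', '/>', s)

sample = '< a href="x / y" / >'

# The lookbehind version also alters the attribute value.
print(strip_with_lookbehind(sample))    # -> <a href="x /y" />

# The stricter version leaves the attribute value untouched.
print(tighten_tag_delimiters(sample))   # -> <a href="x / y" />
```

For the inputs in the question the two functions give the same result, so the stricter variant only matters when attribute values or text can contain a slash followed by whitespace.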
