So in VS Code I used the regex pattern <script>(.|\n)*?<\/script> to select everything between <script> tags (including the tags themselves), and it worked great. (See the example below.)
<html>
<p>dsldsdsd</p>
<p>dsldsdsd</p>
<p>dsldsdsd</p>
*<script>
Some code
</script>*
*<script>
Some code
</script>*
<p>dsldsdsd</p>
<p>dsldsdsd</p>
</html>
So with <script>(.|\n)*?<\/script>, everything between the * * markers gets selected.
Now what I actually want is the opposite of what I've shown above: select everything else, but leave out the <script> </script> blocks (including the tags themselves). For example, like this:
*<html>
<p>dsldsdsd</p>
<p>dsldsdsd</p>
<p>dsldsdsd</p>*
<script>
Some code
</script>
<script>
Some code
</script>
*<p>dsldsdsd</p>
<p>dsldsdsd</p>
</html>*
So I went through some regex documentation online and tried the following regex to select everything else (and leave out everything between the <script> tags):
^((?!<script>(.|\n)*?<\/script>).)*$
But this just keeps the word <script>. What have I done wrong?
In short, what I'm trying to do is negate the <script>(.|\n)*?<\/script> expression.
Any help is appreciated. Thanks.
CodePudding user response:
One idea is to match what you don't want, but capture what you need in group 1 (\1):
<script>[\s\S]*?<\/script>|((?:<(?!script)|[^<])[\s\S]*?)(?=<script|$)
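The first alternative consumes a whole <script> block without capturing it. So as not to skip over an opening <script, the second alternative starts by matching either a character that is not <, or a < that is not followed by script (checked with a negative lookahead). It then matches lazily until <script occurs or the end ($) is reached.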
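
To see the "match what you don't want, capture what you need" idea outside the editor, here is a minimal Python sketch. It is only an illustration: the function name non_script_parts and the sample string are made up for this example, and Python's re module is not the same engine VS Code uses. In particular, without re.MULTILINE Python's $ matches only at the end of the string (or just before a final newline), which may differ from how your editor treats it.

```python
import re

# Pattern from the answer. The first alternative consumes a whole script block
# without capturing it; the second alternative captures everything else into
# group 1. (The "/" needs no escaping in a Python pattern.)
PATTERN = re.compile(
    r"<script>[\s\S]*?</script>"
    r"|((?:<(?!script)|[^<])[\s\S]*?)(?=<script|$)"
)


def non_script_parts(html):
    """Return the pieces of `html` that lie outside <script>...</script>.

    Hypothetical helper for illustration: matches where group 1 is None are
    the script blocks themselves, so they are skipped.
    """
    return [m.group(1) for m in PATTERN.finditer(html) if m.group(1) is not None]


sample = "<p>a</p>\n<script>\nx\n</script>\n<p>b</p>"
print(non_script_parts(sample))  # ['<p>a</p>\n', '\n<p>b</p>']
```

In an editor's find and replace, the same trick would mean checking whether group 1 took part in the match, since every script block is still matched by the first alternative.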