I have the below HTML file (simplified code):
<!DOCTYPE html>
<html lang="en" class="no-js">
...
<body>
This is my header!
{% include "header.tpl.html"%}
</body>
</html>
And I have the below code snippet in JS that is supposed to replace all occurrences of "header" in the HTML:
const v = 'header';
let re = new RegExp(`${v}`, 'gi');
fileStr = fileStr.replace(re, '{% test %}');
As a result, I get:
<!DOCTYPE html>
<html lang="en" class="no-js">
...
<body>
This is my {% test %}!
{% include "{% test %}.tpl.html"%}
</body>
</html>
So, the first replacement with {% test %} works as expected, but the regular expression also matches occurrences between {% and %}, which messes up my template.
How do I build a regular expression that skips all text between {% and %} markers?
CodePudding user response:
You can match the {%...%} substrings and capture them so that they can be skipped:
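var fileStr = ' This is my header!\n {% include "header.tpl.html"%}';
let v = 'header';
// Group 1 captures any {%...%} block; the second alternative matches the search term
let re = new RegExp(`({%[^]*?%})|${v}`, 'gi');
// Keep Group 1 as is when it matched; otherwise insert the replacement
fileStr = fileStr.replace(re, (x,y) => y || `{% test %}`);
console.log(fileStr);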
The ({%[^]*?%})|header pattern either matches and captures into Group 1 a {% sequence, then zero or more characters (as few as possible), and then %}, or it matches the header string in a case-insensitive way in any other context. If Group 1 matched, the replacement is the Group 1 value, so the {%...%} substrings are kept as is. Otherwise, header is replaced with {% test %}.
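The capture-and-skip idea can also be wrapped in a small function and run against a template like the one in the question. This is only a minimal sketch: the name replaceOutsideTags and its parameters are made up for illustration, and it assumes the search word contains no regex special characters.

```javascript
// Hypothetical helper: replaces `word` everywhere except inside {% ... %} tags.
// Assumes `word` has no regex metacharacters.
function replaceOutsideTags(text, word, replacement) {
  // Group 1 captures a whole {% ... %} tag; [^] matches any char, including newlines.
  const re = new RegExp(`({%[^]*?%})|${word}`, 'gi');
  // If Group 1 matched, return it unchanged; otherwise return the replacement.
  return text.replace(re, (match, tag) => tag || replacement);
}

const template = [
  '<body>',
  'This is my header!',
  '{% include "header.tpl.html"%}',
  '</body>',
].join('\n');

const result = replaceOutsideTags(template, 'header', '{% test %}');
console.log(result);
```

Because the replacement is returned from a callback, it is inserted literally, so characters such as $ in the replacement text are not treated as special replacement patterns.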
You may also use
let re = new RegExp(`({%[^]*?%})|\\b${v}\\b`, 'gi');
to replace whole words only. Note that \b works as expected only when the search words contain no special characters; otherwise, you would need other types of boundaries.
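For search terms that do contain special characters, one possible variant is to escape the term and replace \b with lookarounds. The sketch below is an assumption about what "other types of boundaries" could look like, not the only option: the helper names escapeRegExp and replaceTermOutsideTags are hypothetical, and it relies on lookbehind support, which is available in current Node.js and modern browsers but not in older engines.

```javascript
// Escapes regex metacharacters so the search term is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: whole-term replacement outside {% ... %} tags.
function replaceTermOutsideTags(text, term, replacement) {
  const t = escapeRegExp(term);
  // (?<!\w) and (?!\w) require that no word character touches the term on either side.
  const re = new RegExp(`({%[^]*?%})|(?<!\\w)${t}(?!\\w)`, 'gi');
  // Tags captured in Group 1 are returned unchanged; other matches are replaced.
  return text.replace(re, (match, tag) => tag || replacement);
}

const input = 'c++ and c++17 {% include "c++.html" %}';
const output = replaceTermOutsideTags(input, 'c++', 'X');
console.log(output);
```

Here the standalone c++ is replaced, while c++17 is left alone because a word character follows the term, and the occurrence inside the {% ... %} tag is skipped.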