RegEx: Detect string interpolation but not inside attribute

Time:12-21

I am working on creating Web Components and I need a regular expression that captures instances of string interpolation in a template string.
For example with the following string:

<img src="${this.image}"/><h5>${this.title}</h5><p>${this.description}</p>

The instances of string interpolation are inside ${} and can be captured with: (this(\.\w+)).
But I do not want to capture the first instance because it is inside an attribute.

I have tried the expression ((?<!".+)this(\.\w+)+(?!.+")), which works with a multiline string (each tag on its own line) but not on a single line.

Here is my RegExr demo.
Perhaps someone with more experience in RegEx can help me out.

CodePudding user response:

I think this should work for you:

[^"]\$\{(this\.\w+)

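This will only take interpolations that are not preceded by a double quote ("). Note that [^"] consumes the character before ${, so read the expression from the capture group rather than from the full match.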
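A minimal sketch of how this pattern behaves on the sample string from the question. The variable names are illustrative, and the `\w+` quantifier is assumed to be what the answer intended.

```javascript
// Sample template from the question (single quotes, so ${...} stays literal text).
const template = '<img src="${this.image}"/><h5>${this.title}</h5><p>${this.description}</p>';

// [^"] consumes the character before ${, so the expression is read from group 1.
const pattern = /[^"]\$\{(this\.\w+)/g;

const found = [...template.matchAll(pattern)].map(m => m[1]);
console.log(found); // [ 'this.title', 'this.description' ]
```

Since the pattern needs one character before `${`, an interpolation at the very start of the string would not be matched.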

CodePudding user response:

Use the following regex:

[^="]{2}\${(\S+?)}

  1. Attributes always have a = and their value is in quotes, so [^="]{2} requires that the two characters before ${ are neither = nor ".
  2. (\S+?) then lazily captures the required data in a capturing group.
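The two points above can be sketched as follows. This is only an illustration on the question's sample string; the braces are escaped here for portability, which should not change what the pattern matches.

```javascript
// Sample template from the question (single quotes, so ${...} stays literal text).
const template = '<img src="${this.image}"/><h5>${this.title}</h5><p>${this.description}</p>';

// Two characters that are neither = nor ", then ${ ... } with a lazy capture.
const pattern = /[^="]{2}\$\{(\S+?)\}/g;

const found = [...template.matchAll(pattern)].map(m => m[1]);
console.log(found); // [ 'this.title', 'this.description' ]

// The pattern needs two characters before ${, so a leading interpolation is missed.
const atStart = [...'${this.title}'.matchAll(pattern)];
console.log(atStart.length); // 0
```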

Demo

CodePudding user response:

You can use a negative lookbehind to account for a quoted attribute: (?<!["'])\$\{this(?:\.\w+)+\}. This will exclude the src="${this.image}" in your example, but you'll get a false positive for quoted HTML text, such as <p>Quote: "${this.quote}"</p>, which is treated like an attribute and skipped.

To avoid that, the negative lookbehind can account for a quoted attribute inside an HTML tag: (?<!<\w+ (\w+=["'][^"']*["'] )*\w+=["'])\$\{this(?:\.\w+)+\}.

Here is an example with both regexes:

const regex1 = /(?<!["'])\$\{this(?:\.\w+)+\}/g;
const regex2 = /(?<!<\w+ (\w+=["'][^"']*["'] )*\w+=["'])\$\{this(?:\.\w+)+\}/g;

[
  '<img src="${this.image}"/><h5>${this.title}</h5><p>${this.description}</p><p>Quote: "${this.quote}"</p>',
  '<img foo="bar" src="${this.image}"/><h5>${this.title}</h5><p>${this.description}</p><p>Quote: "${this.quote}"</p>'
].forEach(str => {
  console.log(str);
  console.log('- regex1:', str.match(regex1));
  console.log('- regex2:', str.match(regex2));
});

Explanation of regex2:

  • (?<! -- negative lookbehind start
  • <\w+ followed by a space -- start of an HTML tag and a space, such as <img
  • (\w+=["'][^"']*["'] )* -- 0+ attributes of form attr="value", each with a trailing space
  • \w+=["'] -- attribute start, such as src=" or src='
  • ) -- negative lookbehind end
  • \$\{this -- literal ${this
  • (?:\.\w+)+ -- non-capture group for 1+ patterns of .something
  • \} -- literal }

Note: If your regex engine does not support negative lookbehind (notably older versions of Safari), you can change the lookbehind to a capture group and restore the captured text with a .replace().
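One possible reading of that fallback, as a sketch rather than the answer's exact method: the lookbehind body becomes an optional capture group, and a `.replace()` callback puts attribute matches back unchanged. The helper name `markTextInterpolations` and the `[[...]]` marker are hypothetical, chosen only for illustration.

```javascript
// Optional group 1 captures what the lookbehind would have rejected:
// an opening tag up to and including attr=" (or attr=').
const pattern = /(<\w+ (?:\w+=["'][^"']*["'] )*\w+=["'])?\$\{(this(?:\.\w+)+)\}/g;

// Hypothetical helper: marks interpolations in text and leaves attribute ones untouched.
function markTextInterpolations(template) {
  return template.replace(pattern, (match, attrPrefix, expr) =>
    // attrPrefix is undefined when the optional group did not participate.
    attrPrefix !== undefined ? match : `[[${expr}]]`
  );
}

const input = '<img foo="bar" src="${this.image}"/><h5>${this.title}</h5><p>Quote: "${this.quote}"</p>';
console.log(markTextInterpolations(input));
// <img foo="bar" src="${this.image}"/><h5>[[this.title]]</h5><p>Quote: "[[this.quote]]"</p>
```

Because the scan runs left to right, the tag prefix is consumed together with the attribute interpolation, so the callback can tell the two cases apart without any lookbehind.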
