Regular Expression to exclude a certain pattern

Time:11-03

I am trying to count the occurrences of a substring in a string using:

function countOccurences(str, word) {
    // Build a global, case-insensitive whole-word pattern for `word`
    var regex = new RegExp("\\b" + word + "\\b", "gi");
    console.log((str.match(regex) || []).length);
}


let String=' test TEST TESTING Test I like testing <h3>TEST</h3> class="test id="test" ';

let asset="Test";
countOccurences(String,asset);

The result I get is 6, which is correct for whole-word matches, but I want to exclude the occurrences of test that follow class=" and id=", so that the result is 4 rather than 6.

CodePudding user response:

You can form a regex that will match and capture the substrings you would like to skip when counting and match the asset in other contexts only.

The sample regex may look like

/\b((?:id|class)="Test\b)|\bTest\b/gi

This will match

  • \b((?:id|class)="Test\b) - word boundary, Group 1 capturing id or class, then ="Test as a whole word
  • | - or
  • \bTest\b - whole word Test
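
To make the role of Group 1 concrete, here is a minimal sketch that runs the sample regex over a short string and labels each match as skipped or counted. The `classifyMatches` name and the input string are made up for illustration and are not part of the original answer.

```javascript
// Hypothetical helper: label each match of the sample regex as "skip" or "count".
function classifyMatches(str) {
    const regex = /\b((?:id|class)="Test\b)|\bTest\b/gi;
    const result = [];
    for (const m of str.matchAll(regex)) {
        // m[1] is defined only when the first alternative (Group 1) matched
        result.push({ match: m[0], action: m[1] !== undefined ? "skip" : "count" });
    }
    return result;
}

// Illustrative input, not taken from the question
console.log(classifyMatches('Test id="test" testing class="TEST" test'));
```

Only the entries labeled "count" would contribute to the final total.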

See the JavaScript demo:

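function countOccurences(str, exceptions, word) {
    // Escape regex metacharacters in each exception so it is matched literally
    const escaped = exceptions.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'));
    // Group 1 captures e.g. id="Test; the second alternative matches any other whole word
    const pattern = "\\b((?:" + escaped.join("|") + ')="' + word + "\\b)|\\b" + word + "\\b";
    const regex = new RegExp(pattern, "gi");
    let count = 0, m;
    while ((m = regex.exec(str)) !== null) {
        // Count only the matches in which Group 1 did not participate
        if (!m[1]) {
            count++;
        }
    }
    console.log(count);
}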


let text = ' test TEST TESTING Test I like testing <h3>TEST</h3> class="test id="test" ';

let asset="Test";
let exception_arr = ["id", "class"]
countOccurences(text,exception_arr,asset);
// => 4
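
The demo escapes the entries of exceptions but inserts word into the pattern as is, so a word containing regex metacharacters (for example a.b) would be misinterpreted. A possible variant, sketched below under the assumption that word should always be treated as literal text, escapes both. The names `escapeRegExp` and `countOutsideAttributes` are hypothetical, and the function returns the count instead of logging it.

```javascript
// Hypothetical helper: escape regex metacharacters so the text is matched literally.
function escapeRegExp(s) {
    return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Variant of the capture-and-skip approach that also escapes `word`.
function countOutsideAttributes(str, exceptions, word) {
    const names = exceptions.map(escapeRegExp).join("|");
    const w = escapeRegExp(word);
    const regex = new RegExp("\\b((?:" + names + ')="' + w + "\\b)|\\b" + w + "\\b", "gi");
    let count = 0;
    for (const m of str.matchAll(regex)) {
        if (m[1] === undefined) count++; // Group 1 did not match: a regular occurrence
    }
    return count;
}

// Illustrative input: without escaping, "a.b" would also match "axb"
console.log(countOutsideAttributes('a.b axb id="a.b" A.B', ["id", "class"], "a.b"));
```

The \b anchors still assume that word starts and ends with a word character; a word such as ".test" would need different boundary handling.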

Another solution is based on a negative lookbehind (not supported in all JavaScript environments yet):

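function countOccurences(str, exceptions, word) {
    // Escape regex metacharacters in each exception so it is matched literally
    const escaped = exceptions.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'));
    // The lookbehind rejects a word that directly follows e.g. id=" or class="
    const pattern = "\\b(?<!\\b(?:" + escaped.join("|") + ')=")' + word + "\\b";
    const regex = new RegExp(pattern, "gi");
    // Fall back to an empty array so that no match yields 0
    console.log((str.match(regex) || []).length);
}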

let text = ' test TEST TESTING Test I like testing <h3>TEST</h3> class="test id="test" ';

let asset="Test";
let exception_arr = ["id", "class"]
countOccurences(text,exception_arr,asset);

Here, the /\b(?<!\b(?:id|class)=")Test\b/gi regex matches any Test as a whole word unless it is immediately preceded by id or class as a whole word followed by the =" substring.
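
Because lookbehind support varies between environments, one option is to detect it at run time and fall back to the capture-and-skip pattern when it is missing. The sketch below shows one way to do that; `supportsLookbehind` and `countWord` are hypothetical names, and the exception names and the word are assumed to contain no regex metacharacters.

```javascript
// Returns true when the engine can compile a lookbehind assertion.
function supportsLookbehind() {
    try {
        new RegExp("(?<!a)b"); // throws a SyntaxError on engines without lookbehind
        return true;
    } catch (e) {
        return false;
    }
}

// Hypothetical wrapper: use the lookbehind when available, otherwise capture and skip.
// Assumes `exceptions` and `word` contain no regex metacharacters.
function countWord(str, exceptions, word) {
    const names = exceptions.join("|");
    if (supportsLookbehind()) {
        const lookbehindRe = new RegExp("\\b(?<!\\b(?:" + names + ')=")' + word + "\\b", "gi");
        return (str.match(lookbehindRe) || []).length;
    }
    const fallbackRe = new RegExp("\\b((?:" + names + ')="' + word + "\\b)|\\b" + word + "\\b", "gi");
    let count = 0, m;
    while ((m = fallbackRe.exec(str)) !== null) {
        if (!m[1]) count++; // skip matches captured by Group 1
    }
    return count;
}

// Illustrative input, not taken from the question
console.log(countWord('Test id="test" class="test" test', ["id", "class"], "Test"));
```

Building the pattern through the RegExp constructor keeps the lookbehind out of a regex literal, so the script still parses on engines that would reject the literal.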
