Home > Enterprise >  REGEX: How to count the occurrences of a word or words in a given string despite the order of its ch
REGEX: How to count the occurrences of a word or words in a given string despite the order of its ch

Time:07-05

Is there a way to use regex to find out how many times a word can occured in a given string despite order of its characters Let's take "NEOTWONE" as our examples which should return a count of "4"

  NEO --> 'ONE' -->  1
  OTW --> 'TWO' -->  1
  TWO --> 'TWO' -->  1
  ONE --> 'ONE' -->  1

This is what I have so far and I couldn't get the regex to work properly.

const nums = ['ZERO','ONE','TWO','THREE','FOUR','FIVE','SIX','SEVEN','EIGHT','NINE'];

function amount(str,count=0) {
  for (const n of nums) {
    RegExp(`\\b[${str}] \\b`,'g').test(n) && count  ;
  }
  return count;
}

console.log(amount('ONE')); // 1
console.log(amount('ONEOTW')); // 2
console.log(amount('ONENO')); // 1
console.log(amount('NEOTWONE')); // 2

As you can see, the 2nd, 3rd & 4th example above did not render the correct outcome which should be:

console.log(amount('ONEOTW')); // 3
console.log(amount('ONENO')); // 2
console.log(amount('NEOTWONE')); // 4

I'm new to regex, any feedback will be greatly appreciated with explanation. Million thanks in advance :)

CodePudding user response:

You'll have to create a huge regular expression for that:

First, create all the unique permutations of all input strings (nums).

Concatenate these into one regular expression, using | as separator, but use look-ahead, so that one character can be part of a match multiple times. So for instance, for "ONE", the regular expression would be:

(?=ONE|OEN|ENO|EON|NEO|NOE)

But then you would also include all permutations of "ZERO" and all the other words.

function* permutations(word) {
    if (word.length <= 1) return yield word;
    for (let i = 0; i < word.length; i  ) {
        for (let perm of permutations(word.slice(0, i)   word.slice(i   1))) {
            yield word[i]   perm;
        }
    }
}

function createRegex(words) {
    const allPermutations = words.flatMap(word => [...new Set(permutations(word))]);
    return RegExp("(?="   allPermutations.join("|")   ")", "g");
}

function countMatches(regex, phrase) {
    return phrase.match(regex)?.length ?? 0;
}

const nums = ['ZERO','ONE','TWO','THREE','FOUR','FIVE','SIX','SEVEN','EIGHT','NINE'];
const regex = createRegex(nums);
for (const test of ['ONE', 'ONEOTW', 'ONENO', 'NEOTWONE']) {
    console.log(test, countMatches(regex, test));
}

Note that for the second test the answer is 3, not 2, since "NEO" also counts.

  • Related