Ignore periods and commas of RegExp

Time:06-07

I have a RegExp like so:

const keywordsClone = [...alert?.keywords];

keywordsClone.sort((a, b) => (a.length > b.length ? -1 : 1));

let transcript = alert?.transcript;

forEach(keywordsClone, (k) => {
  const regExpression = new RegExp(`(${k})`, 'ig'); //now it has exact match and case insensitive. But I need to allow . and , also.

  transcript = transcript.replace(regExpression, '<span >$&</span>');
});

return transcript;

This works fine in general. But if keywordsClone contains the keyword "structure fire" and the transcript contains the text "structure. Fire", nothing is matched. If the transcript contains "structure Fire" instead, it works. How can I allow a . or a , between the words of a keyword as well? Any help would be appreciated.


P.S.

alert?.keywords is an array of strings (string[]):

0: "structure fire"
1: "fire showing from the second floor"

P.S.2

alert?.keywords

0: "on fire"

transcript

"This is the Fire Department Fire Department I need towards 8.5 481 8.5. In reference to an unattended fire Attention. Fire Department needs around the pole, 8.5, 8.5 for reports of an entertainment fire."

Then it shows

This is the Fire Department Fire Department I need towards 8.5 481 8.5. In reference to an unattended fire Attenti<span >on. Fire</span> Department needs around the pole, 8.5, 8.5 for reports of an entertainment fire.

The problem here is that this is not an exact match, i.e. nothing should be highlighted in this transcript.


CodePudding user response:

How about this:

let keywords = keywords_string.split(/(?:[\.,\s])+/);

const regex = new RegExp(`(?<=[\\.,\\s]|^)(${keywords.join("[\\.,\\s]+")})(?=[\\.,\\s]|$)`, 'ig');

const alert = {
    transcript: "Something something structure. Fire is showing ... and there is fire, showing from the second floor",
    keywords: ["structure fire", "fire showing from the second floor"]
}

const keywordsClone = [...alert?.keywords];

keywordsClone.sort((a, b) => (a.length > b.length ? -1 : 1));

let transcript = alert?.transcript;

keywordsClone.forEach((keywords_string) => {
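    // Split the keyword phrase into words on any run of periods, commas, or whitespace.
    let keywords = keywords_string.split(/(?:[\.,\s])+/);

    // Allow the same kind of run between the words; the lookarounds keep the match on word boundaries.
    const regex = new RegExp(`(?<=[\\.,\\s]|^)(${keywords.join("[\\.,\\s]+")})(?=[\\.,\\s]|$)`, 'ig');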

    transcript = transcript.replace(regex, '<span >$&</span>');
})

console.log(transcript);

Which outputs this from the example:

Something something <span >structure. Fire</span> is showing ... and there is <span >fire, showing from the second floor</span>

Explanation

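Essentially, we split each keyword string on periods, commas, and whitespace, then join the resulting words back together with [\.,\s]+ between them. In the final pattern, the gap between two words of a keyword can therefore be any run of periods, commas, or whitespace in the transcript, so "structure fire" also matches "structure. Fire".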
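
As a minimal sketch of the split-and-join step in isolation, the snippet below builds the pattern for a single keyword and tries it on two short strings. The helper name keywordToPattern is hypothetical, and the sketch assumes the separator class is followed by + so that a period followed by a space counts as one gap.

```javascript
// Hypothetical helper: turn "structure fire" into the pattern source
// structure[.,\s]+fire
function keywordToPattern(keyword) {
  // Split the keyword on any run of periods, commas, or whitespace,
  // dropping empty pieces left by leading or trailing separators.
  const words = keyword.split(/[.,\s]+/).filter(Boolean);
  // Allow the same kind of run between the words when matching.
  return words.join('[.,\\s]+');
}

const pattern = keywordToPattern('structure fire');
const regex = new RegExp(pattern, 'ig');

console.log(pattern);
console.log('structure. Fire'.match(regex)); // matches "structure. Fire"
console.log('structure, fire'.match(regex)); // matches "structure, fire"
console.log('structurefire'.match(regex));   // null: at least one separator is required
```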

EDIT: As per P.S.2, I added the lookbehind (?<=[\\.,\\s]|^) and the lookahead (?=[\\.,\\s]|$) to ensure that a match neither starts nor ends in the middle of another word.
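
To see what the lookarounds change, here is a small sketch using the keyword "on fire" and a fragment of the transcript from P.S.2. The variable names loose and bounded are only illustrative. Note that lookbehind needs a JavaScript engine with ES2018 regular expression support.

```javascript
const transcript = 'In reference to an unattended fire Attention. Fire Department';

// Without lookarounds, "on fire" matches the tail of "Attention" plus "Fire".
const loose = new RegExp('on[.,\\s]+fire', 'ig');

// With lookarounds, a match must begin right after a separator (or at the
// start of the text) and end right before a separator (or at the end).
const bounded = new RegExp('(?<=[.,\\s]|^)on[.,\\s]+fire(?=[.,\\s]|$)', 'ig');

const looseHits = transcript.match(loose);     // one hit: "on. Fire"
const boundedHits = transcript.match(bounded); // null: "on" is inside "Attention"

console.log(looseHits, boundedHits);

// A genuine occurrence is still found by the bounded pattern.
console.log('The house is on fire.'.match(bounded)); // one hit: "on fire"
```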

CodePudding user response:

If I use vanilla JS syntax, I get what I think is your expected output with this

const tran = obj => {
  // Split on runs of separators so that consecutive separators do not produce empty strings.
  const keywordsClone = obj?.keywords.split(/[.,?\s]+/);

  keywordsClone.sort((a, b) => (a.length > b.length ? -1 : 1));
  let transcript = obj?.transcript;

  keywordsClone.forEach(k => {
    const regExpression = new RegExp(`${k}`, 'ig'); 
    console.log(regExpression)
    transcript = transcript.replace(regExpression, '<span >$&</span>');
  });
  return transcript;
};

console.log(tran({keywords:"structure fire", transcript: "structure. Fire"}))
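
One caveat with interpolating keywords straight into the pattern and calling replace once per keyword: a keyword containing regex metacharacters (such as + or parentheses) changes the pattern, and a later keyword can match inside markup inserted by an earlier replace. A possible variation, sketched below under those assumptions, escapes each word and highlights all keywords in a single pass. The helper names escapeRegExp and highlightKeywords are hypothetical, and the separator and lookaround choices follow the first answer.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: highlight every keyword phrase in one replace call.
function highlightKeywords(transcript, keywords) {
  const SEP = '[.,\\s]+';
  const alternatives = [...keywords]
    // Longest phrases first, so they win over shorter overlapping ones.
    .sort((a, b) => b.length - a.length)
    // Split each phrase into words, escape them, and allow separators between.
    .map((k) => k.split(/[.,\s]+/).filter(Boolean).map(escapeRegExp).join(SEP))
    // Drop phrases that consisted only of separators.
    .filter(Boolean);

  if (alternatives.length === 0) return transcript;

  const regex = new RegExp(
    `(?<=[.,\\s]|^)(?:${alternatives.join('|')})(?=[.,\\s]|$)`,
    'ig'
  );
  // A single pass means inserted markup is never scanned again.
  return transcript.replace(regex, '<span >$&</span>');
}

console.log(
  highlightKeywords(
    'Something something structure. Fire is showing ... and there is fire, showing from the second floor',
    ['structure fire', 'fire showing from the second floor']
  )
);
```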
