I have a RegExp like so:
const keywordsClone = [...alert?.keywords];
keywordsClone.sort((a, b) => (a.length > b.length ? -1 : 1));
let transcript = alert?.transcript;
forEach(keywordsClone, (k) => {
const regExpression = new RegExp(`(${k})`, 'ig'); //now it has exact match and case insensitive. But I need to allow . and , also.
transcript = transcript.replace(regExpression, '<span >$&</span>');
});
return transcript;
It works fine in general. But say keywordsClone contains "structure fire" and the transcript contains "structure. Fire": then it doesn't match. If the transcript contains "structure Fire" instead, it works. So how can I allow "." and "," between the words as well? Any help here, please?
P.S. alert?.keywords is an array of strings (string[]):
0: "structure fire"
1: "fire showing from the second floor"
P.S.2 Given alert?.keywords:
0: "on fire"
and transcript:
"This is the Fire Department Fire Department I need towards 8.5 481 8.5. In reference to an unattended fire Attention. Fire Department needs around the pole, 8.5, 8.5 for reports of an entertainment fire."
it shows:
This is the Fire Department Fire Department I need towards 8.5 481 8.5. In reference to an unattended fire Attenti<span >on. Fire</span> Department needs around the pole, 8.5, 8.5 for reports of an entertainment fire.
The problem is that this is not an exact match: "on" here is only the tail of "Attention", so nothing should be highlighted in this transcript.
CodePudding user response:
How about this:
let keywords = keywords_string.split(/(?:[\.,\s])+/);
const regex = new RegExp(`(?<=[\\.,\\s]|^)(${keywords.join("[\\.,\\s]+")})(?=[\\.,\\s]|$)`, 'ig');
const alert = {
transcript: "Something something structure. Fire is showing ... and there is fire, showing from the second floor",
keywords: ["structure fire", "fire showing from the second floor"]
}
const keywordsClone = [...alert?.keywords];
keywordsClone.sort((a, b) => (a.length > b.length ? -1 : 1));
let transcript = alert?.transcript;
keywordsClone.forEach((keywords_string) => {
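// Split the keyword phrase into words on any run of periods, commas or whitespace.
let keywords = keywords_string.split(/(?:[\.,\s])+/);
// Rejoin the words so that any such run may separate them in the transcript.
const regex = new RegExp(`(?<=[\\.,\\s]|^)(${keywords.join("[\\.,\\s]+")})(?=[\\.,\\s]|$)`, 'ig');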
transcript = transcript.replace(regex, '<span >$&</span>');
})
console.log(transcript);
Which outputs this for the example:
Something something <span >structure. Fire</span> is showing ... and there is <span >fire, showing from the second floor</span>
Explanation
Essentially, we split each keyword string on periods, commas and whitespace, then join the pieces back together with [\.,\s]+ between them. That way, any run of periods, commas and whitespace in the transcript is accepted between two words of a keyword.
EDIT: As per P.S.2, I added the lookbehind (?<=[\\.,\\s]|^) and the lookahead (?=[\\.,\\s]|$) to ensure that the match doesn't start or end in the middle of another word, respectively.
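The same idea can be wrapped in a small reusable function. This is only a sketch under a few assumptions: the names escapeRegExp, buildKeywordRegex and highlight are made up for illustration, the escaping step is an addition (it guards against keywords that contain regex metacharacters such as "(" or "+"), and lookbehind support requires a reasonably modern JavaScript engine.

```javascript
// Hypothetical helper: escape regex metacharacters so a keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build the regex for one keyword phrase.
// Returns null when the phrase contains no actual words.
function buildKeywordRegex(keyword) {
  const words = keyword
    .split(/[.,\s]+/)   // split on runs of periods, commas, whitespace
    .filter(Boolean)    // drop empty pieces from leading/trailing separators
    .map(escapeRegExp);
  if (words.length === 0) return null;
  // Any run of separators may sit between words; the lookarounds keep the
  // match from starting or ending inside a longer word.
  return new RegExp(`(?<=[.,\\s]|^)${words.join('[.,\\s]+')}(?=[.,\\s]|$)`, 'gi');
}

// Hypothetical helper: wrap every keyword match in a span, longest keywords first.
function highlight(transcript, keywords) {
  return [...keywords]
    .sort((a, b) => b.length - a.length)
    .reduce((text, keyword) => {
      const regex = buildKeywordRegex(keyword);
      return regex ? text.replace(regex, '<span >$&</span>') : text;
    }, transcript);
}

console.log(highlight('structure. Fire', ['structure fire']));
console.log(highlight('Attention. Fire Department', ['on fire']));
```

The first call wraps the whole phrase in one span, while the second leaves the text untouched because "on" is preceded by a letter and so fails the lookbehind.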
CodePudding user response:
Using vanilla JS syntax, I get what I think is your expected output with this (note that it takes keywords as a single string and highlights each word separately):
const tran = obj => {
const keywordsClone = obj?.keywords.split(/[.,?\s ]/);
keywordsClone.sort((a, b) => (a.length > b.length ? -1 : 1));
let transcript = obj?.transcript;
keywordsClone.forEach(k => {
const regExpression = new RegExp(`${k}`, 'ig');
console.log(regExpression)
transcript = transcript.replace(regExpression, '<span >$&</span>');
});
return transcript;
};
console.log(tran({keywords:"structure fire", transcript: "structure. Fire"}))