Matching sub string in another string using regex js

Time:09-01

I am trying to check whether a substring already exists in another string as an exact (whole-word) match. However, it fails in some cases, for example when the search term contains characters such as (, { or [.

It's called matching a word boundary, which can be accomplished using \b.

Example: f(x) in the first three cases below.

// check exact match
const escapeRegExpMatch = (s) => {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')
}
const isExactMatch = (str, match) => {
  return new RegExp(`\\b${escapeRegExpMatch(match)}\\b`).test(str)
}

// f(x)
console.log(true, isExactMatch("09.12.06.Inkigayo.f(x) - Chu.tp", "f(x)")) // output: false
console.log(true, isExactMatch("091108.Popular Song f(x) Chu.tp", "f(x)")) // output: false
console.log(true, isExactMatch("090925 Music Bank.F(x).Digital Rank   La cha ta.HDTV.1080i.ts", "f(x)")) // output: false
console.log(true, isExactMatch("06.07.30.Inkigayo.Ayumi - Cutie Honey.ts", "Ayumi"))
console.log(true, isExactMatch("13.10.18.Music Bank.IU - The Red Shoes.ts", "The Red"))
// cases that should not match (expected: false)
console.log(false, isExactMatch("13.06.16.Inkigayo.Sistar - Intro & Give It to Me.ts", "ive"))
console.log(false, isExactMatch("13.06.16.Inkigayo.Sistar - Intro & Give It to Me.ts", "sis"))
console.log(false, isExactMatch("13.06.16.WOWOW LIVE.SNSD Girls & Peace~ Japan 2nd Tour - GEE.ts", "ieve"))
console.log(false, isExactMatch("14.06.27.SBS Sports2014 FIFA World Cup Cheering Event.Rainbow - Tell Me Tell Me & A.ts", "s2"))


CodePudding user response:

You can use String.prototype.includes(), which may be enough. If you need a regex for a specific reason, the following code should work.

// check exact match
const escapeRegExpMatch = (s) => {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')
}
const isExactMatch = (str, match) => {
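  // Trim before escaping so no escape sequence can be cut off; the g flag is not needed for a single test()
  return new RegExp(escapeRegExpMatch(match.trim()), 'i').test(str);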
}

console.log(isExactMatch("09.12.06.Inkigayo.f(x) - Chu.tp", "f(x)"))
console.log(isExactMatch("091108.Popular Song f(x) Chu.tp", "f(x)"))
console.log(isExactMatch("090925 Music Bank.F(x).Digital Rank   La cha ta.HDTV.1080i.ts", "f(x)")) // output: true (the i flag ignores case)
console.log(isExactMatch("06.07.30.Inkigayo.Ayumi - Cutie Honey.ts", "Ayumi"))
console.log(isExactMatch("13.10.18.Music Bank.IU - The Red Shoes.ts", "The Red"))
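
For comparison, here is a rough sketch of the String.prototype.includes() route mentioned above, assuming a case-insensitive comparison is wanted. The helper name containsIgnoreCase is made up for illustration. Like the regex in this answer, it is a plain substring test, so it also reports true for partial words such as "ive" inside "Give".

```javascript
// Hypothetical helper: case-insensitive substring test without a regex.
const containsIgnoreCase = (str, keyword) =>
  str.toLowerCase().includes(keyword.trim().toLowerCase());

console.log(containsIgnoreCase("090925 Music Bank.F(x).Digital Rank   La cha ta.HDTV.1080i.ts", "f(x)")); // true
// A substring test cannot tell whole words from partial ones:
console.log(containsIgnoreCase("13.06.16.Inkigayo.Sistar - Intro & Give It to Me.ts", "ive")); // true
```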

CodePudding user response:

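The main point is that \b is a context-dependent construct: it only matches between a word character (letter, digit or underscore) and a non-word character or the edge of the string. A keyword does not always start and end with a word character; it may begin or end with special characters such as ( or [, and \b cannot match next to those when another non-word character follows. So unambiguous word boundaries are needed: require a non-word character (or the start of the string) before the keyword and no word character after it.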

function isExactMatch(str, keyword) {
    let re_pattern = `(?:^|\\W)(${keyword.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')})(?!\\w)`
    return new RegExp(re_pattern, "i").test(str)
}
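
To make the difference concrete, here is a small side-by-side sketch using strings from the question. The helper names escapeRegExp, withWordBoundary and withLookarounds are made up for this illustration. The second helper uses the same pattern as the function above, minus the capturing group.

```javascript
// Escape regex metacharacters so the keyword is matched literally.
const escapeRegExp = (s) => s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

// \b on both sides: fails when the keyword ends in a non-word character
// such as ")" and is followed by another non-word character.
const withWordBoundary = (str, keyword) =>
  new RegExp(`\\b${escapeRegExp(keyword)}\\b`, 'i').test(str);

// Unambiguous boundaries: start of string or a non-word character before
// the keyword, and no word character right after it.
const withLookarounds = (str, keyword) =>
  new RegExp(`(?:^|\\W)${escapeRegExp(keyword)}(?!\\w)`, 'i').test(str);

const title = "091108.Popular Song f(x) Chu.tp";
console.log(withWordBoundary(title, "f(x)")); // false: no \b between ")" and " "
console.log(withLookarounds(title, "f(x)"));  // true

// Partial words are still rejected:
console.log(withLookarounds("13.06.16.Inkigayo.Sistar - Intro & Give It to Me.ts", "ive")); // false
```

Note that (?:^|\W) consumes the character before the keyword; for a plain yes/no test() that makes no difference.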
