Regex assertions match differently with different spacing

Time:09-01

I'm new to RegEx and trying to learn via MDN. I've made it to assertions and I've run into some confusion with the Lookahead assertion x(?=y).

The example MDN gives is:

/Jack(?=Sprat|Frost)/ matches "Jack" only if it is followed by "Sprat" or "Frost".

However, when I tested this, it didn't match in either case until I added spaces after = and around |. Here are some tests I ran:

let regex = /Jack(?=Sprat|Frost)/;

let nurseryRhyme1 = 'Jack Frost is real.' 

let nurseryRhyme2 = 'Jack, this Frost is real.' 

let nurseryRhyme3 = 'Jack Sprat is not real.'


console.log(nurseryRhyme1.match(regex)) // null
console.log(nurseryRhyme2.match(regex)) // null
console.log(nurseryRhyme3.match(regex)) // null 

regex = /Jack(?= Sprat|Frost)/;

nurseryRhyme1 = 'Jack Frost is real.' 

nurseryRhyme2 = 'Jack, this Frost is real.' 

nurseryRhyme3 = 'Jack Sprat is not real.'

console.log(nurseryRhyme1.match(regex)) // null
console.log(nurseryRhyme2.match(regex)) // null 
console.log(nurseryRhyme3.match(regex)) // matches, returns ['Jack']

regex = /Jack(?= Sprat | Frost)/;

nurseryRhyme1 = 'Jack Frost is real.' 

nurseryRhyme2 = 'Jack, this Frost is real.' 

nurseryRhyme3 = 'Jack Sprat is not real.'

console.log(nurseryRhyme1.match(regex)) // matches, returns ['Jack']
console.log(nurseryRhyme2.match(regex)) // null 
console.log(nurseryRhyme3.match(regex)) // matches, returns ['Jack']

The documentation doesn't mention anything about spaces being necessary and lays out the pattern as x(?=y) (no spaces).

In general, is spacing a necessary part of a regex assertion, or am I doing something wrong? I've tended to avoid regex in the past because it looks like chaos, and I'm trying to get a better sense of tips and tricks for making it work.

Any guidance is appreciated - I couldn't find this specific topic while searching, but if there's a post I'd love to read it!

CodePudding user response:

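Every character matters in a regular expression, including spaces. A space in the pattern matches a literal space, and the lookahead is tested against the characters immediately after "Jack". In 'Jack Frost' the next character is a space, not "F", so the pattern without spaces does not match.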

/Jack(?=Sprat|Frost)/ matches "Jack" only if it is followed by "Sprat" or "Frost".

They probably meant a string like this, with no space between the two words:

JackSprat

const reg = /Jack(?=Sprat|Frost)/
const string = 'JackSprat'

console.log(string.match(reg)) // matches, returns ['Jack']
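
A quick way to check this is to run the same pattern against a string with and without the space. This is only a minimal sketch; the variable name `lookahead` is made up for illustration:

```javascript
// Same pattern as in the MDN example, no spaces anywhere.
const lookahead = /Jack(?=Sprat|Frost)/;

// "Frost" directly follows "Jack", so the lookahead succeeds.
console.log(lookahead.test('JackFrost'));  // true

// A space follows "Jack", and a space is neither "Sprat" nor "Frost".
console.log(lookahead.test('Jack Frost')); // false
```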

CodePudding user response:

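Here is one way to do it:

let regex = /Jack\s*(?=Sprat|Frost)/;

Add \s* after Jack to allow zero or more whitespace characters before the lookahead. Because \s* sits outside the lookahead, any whitespace it matches becomes part of the result.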

let regex = /Jack\s*(?=Sprat|Frost)/;

let nurseryRhyme1 = 'JackFrost is real.' 

let nurseryRhyme2 = 'Jack, this Frost is real.' 

let nurseryRhyme3 = 'Jack Sprat is not real.'

console.log(nurseryRhyme1.match(regex)) // matches, returns ['Jack']
console.log(nurseryRhyme2.match(regex)) // null
console.log(nurseryRhyme3.match(regex)) // matches, returns ['Jack '] (with the trailing space)
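
If the match should be exactly "Jack" with no trailing whitespace, one possible variant is to move the \s* inside the lookahead so it is checked but not consumed. This is a sketch of that idea rather than part of the answer above; the names `jackOnly`, `samples` and `results` are made up for illustration:

```javascript
// Variant: the optional whitespace lives inside the lookahead,
// so it is tested but never becomes part of the match.
const jackOnly = /Jack(?=\s*(?:Sprat|Frost))/;

const samples = [
  'JackFrost is real.',
  'Jack, this Frost is real.',
  'Jack Sprat is not real.',
];

// Collect the matched text, or null when there is no match.
const results = samples.map((s) => {
  const m = s.match(jackOnly);
  return m ? m[0] : null;
});

console.log(results); // [ 'Jack', null, 'Jack' ]
```

The second string still fails because a comma, not whitespace, follows "Jack".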
