Regex lookaround to find anything up to an already searched group

Time:03-26

I'm trying to analyze search queries of a particular pattern.

The pattern is: How many/much _____ is/are _____.

The blanks are unknown to me, but I want to extract any query that follows this pattern. My challenge is finding a way to use lookarounds to capture everything between many/much and is/are, and everything after is/are, without including those keywords themselves.

Here's my regex so far:

(([hH]ow many?)|([hH]ow much?))|(?<=is)|(are)|(i|s|n|a|o|f){1,2}|((\")|(\“)|(\/)|(\'))

CodePudding user response:

If you use this regex with the i flag to match case-insensitively

^how\s+(?:much|many)\s+(.*?)\s(?:is|are)\s+(.*?)[.?]?$

Then it'll match these strings

How much bla is blabla.
How many bla are blablabla?

The bla's will be in capture groups 1 and 2.
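As a minimal sketch of how this could be used, the snippet below applies the pattern in JavaScript, assuming the whitespace tokens are meant to be `\s+` (one or more whitespace characters). The helper name `extractBlanks` and the property names `first` and `second` are made up for illustration and are not part of the answer.

```javascript
// Pattern from the answer, with the i flag for case-insensitive matching.
// Assumption: the whitespace tokens carry a + quantifier (\s+).
const pattern = /^how\s+(?:much|many)\s+(.*?)\s(?:is|are)\s+(.*?)[.?]?$/i;

// Hypothetical helper: returns the two blanks, or null when the
// query does not follow the "How many/much ___ is/are ___" shape.
function extractBlanks(query) {
  const m = pattern.exec(query);
  return m ? { first: m[1], second: m[2] } : null;
}

console.log(extractBlanks("How much bla is blabla."));
// { first: 'bla', second: 'blabla' }
console.log(extractBlanks("How many bla are blablabla?"));
// { first: 'bla', second: 'blablabla' }
console.log(extractBlanks("What time is it"));
// null
```

Because the pattern is anchored with ^ and $, a query that merely contains "is" or "are" somewhere will not match unless it also starts with "how much" or "how many".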

CodePudding user response:

Try this:

/(?<=[Hh]ow\smany\s|[Hh]ow\smuch\s)(.+)(?=\sis|\sare)|(?<=is\s|are\s)(.+)/g

Review it at regex101

Lookarounds are placed behind and/or ahead of your capture group:

1st Capture Group

(?<=[Hh]ow\smany\s|[Hh]ow\smuch\s) /* "(H|h)ow"\space"many"\space OR 
                                      "(H|h)ow"\space"much"\space 
                                      must be before capture group */
(.+)               /* capture group: one or more of any character */
(?=\sis|\sare)     /* \space"is" OR \space"are" must be after capture group */
|                  // OR

2nd Capture Group

(?<=is\s|are\s)    /* "is"\space OR "are"\space must be before capture group */
(.+)               /* capture group: one or more of any character */
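A rough sketch of running this lookaround version in JavaScript follows, assuming each capture group is `(.+)`, as the "one or more of anything" comments describe. The helper name `findBlanks` is hypothetical. Each match fills only one of the two groups, so the helper keeps whichever group is defined.

```javascript
// Lookaround regex from the answer.
// Assumption: both capture groups are (.+).
const lookaround =
  /(?<=[Hh]ow\smany\s|[Hh]ow\smuch\s)(.+)(?=\sis|\sare)|(?<=is\s|are\s)(.+)/g;

// Hypothetical helper: collects the text captured by each match.
// A match from the first alternative sets group 1; a match from the
// second alternative sets group 2.
function findBlanks(query) {
  return [...query.matchAll(lookaround)].map((m) =>
    m[1] !== undefined ? m[1] : m[2]
  );
}

console.log(findBlanks("How many apples are in the basket"));
// [ 'apples', 'in the basket' ]
console.log(findBlanks("How much sugar is in a soda"));
// [ 'sugar', 'in a soda' ]
```

Two caveats with this approach: the second lookbehind is not tied to "how many/much", so it also matches after any "is " or "are " in unrelated text, and the lookbehind alternatives `is\s` and `are\s` differ in length, which only works in engines that support variable-length lookbehind (JavaScript does; some other engines do not).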