Any substitute for negative lookahead in regular expressions?

I'm using a regular expression to extract country data in BigQuery, but I don't know how to extract the text I want. These are the example records I'm using.

country
China Anhui Univ Chinese Med, Affiliated Hosp 1, Expt Ctr Clin Res, Sci Res Dept, 117 Meishan Rd, Hefei 230031, Anhui, 12, Peoples R China
Meluna Res, Geldermalsen, Netherlands; [Wiegant, Frederik Anton Clemens] Univ Utrecht, Utrecht, Netherlands

I want to extract the words after the last comma (Peoples R China and Netherlands in the records above), so I tried a negative lookahead:

(, )(?!.*\b\1\b)((\w*\s?){3})

But it seems BigQuery doesn't support lookahead expressions, since it only supports RE2 syntax. Is there any way to extract the country name without using a lookahead?

CodePudding user response：

You can use

,\s*([^,]*)$

The pattern matches:

, - a comma
\s* - zero or more whitespace characters
([^,]*) - capturing group 1: any zero or more chars other than a comma
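$ - end of string.

Since [^,]* cannot match a comma and $ anchors the match at the end of the string, only the last comma in the string can satisfy the pattern, so group 1 holds the text after it.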
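As a quick sanity check outside BigQuery, here is a minimal Python sketch that runs the pattern against the two sample records from the question. Python's re module is not RE2, but the pattern uses no lookarounds or backreferences, so it should behave the same way there. The helper name extract_country is made up for illustration. In BigQuery itself the pattern would presumably be passed to REGEXP_EXTRACT, e.g. REGEXP_EXTRACT(country, r',\s*([^,]*)$'), which returns the capturing group when the pattern contains one.

```python
import re

# The pattern from the answer: a comma, optional whitespace,
# then everything up to the end of the string that is not a comma.
LAST_FIELD = re.compile(r",\s*([^,]*)$")


def extract_country(address):
    """Return the text after the last comma, or None if there is no comma.

    Hypothetical helper for illustration; not part of any library.
    """
    match = LAST_FIELD.search(address)
    return match.group(1) if match else None


# The two sample records from the question.
records = [
    "China Anhui Univ Chinese Med, Affiliated Hosp 1, Expt Ctr Clin Res, "
    "Sci Res Dept, 117 Meishan Rd, Hefei 230031, Anhui, 12, Peoples R China",
    "Meluna Res, Geldermalsen, Netherlands; [Wiegant, Frederik Anton Clemens] "
    "Univ Utrecht, Utrecht, Netherlands",
]

for record in records:
    print(extract_country(record))
# Peoples R China
# Netherlands
```

Note that a record without any comma yields None here, and a record ending in a comma yields an empty string, so real data may need an extra check for those cases.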