I am using regexp_like in Impala with a negative lookbehind to find a pattern in a string array. I've built the expression as follows against a sample data set. Running it yields the following error message.
Invalid regex expression: '(?<=Hello). '
regexp_like(string_field,'(?<!Hello). ')
result | string_field |
---|---|
no match | Hello World, Bye World |
match | Cool, Not Cool |
no match | Cool, Hello, Bye Bye |
This negative lookbehind works in python. Has anyone else come across this? I've tried looking at the documentation but didn't find anything particularly useful.
A better example.
I am trying to find at least one occurrence from a comma separated string array in which at least one of the array elements is not preceded by the keyword e.g. - ('Hello'). A negative lookaround seems like one of the most elegant solutions for the task at hand.
CodePudding user response:
A little clunky, but this works:
regexp_like(string_field, '(^|,)([^H]|H[^e]|He[^l]|Hel[^l]|Hell[^o])')
See live demo.