I have the following code snippet in SQL to select the next piece of text after ABC DEF
that's of variable length:
SELECT trim('ABC DEF ' FROM regexp_substr(my_field, 'ABC DEF ([^ ] )')) FROM my_table
Sample Data:
'{random text here} ABC DEF {my_variable_length_keyword} {random text here}'
Expected Output:
{my_variable_length_keyword}
While this works, it only accounts for cases where there is one space after ABC DEF
. How would I deal with cases where there are tabs, new lines, or multiple spaces before the next word?
I've tried:
SELECT trim('ABC DEF ' FROM regexp_substr(my_field, 'ABC DEF\s ([^ ] )')) FROM my_table
But this doesn't yield any result.
Can someone please help me out with this? Thank you!
CodePudding user response:
(?<=ABC DEF)\s \S
or (?<=ABC DEF)\\s \\S
for string literal used in other language.
Trim the leading whitespaces, then you'll get the final output.
Some notes:
(?<=pattern)
is positive lookbehind, it works like normal regex but won't be put into match result, place it before strings that you want to match.- in most platform,
pattern
in lookbehind doesn't support*
and(?<=ABC DEF\s )
is unsupported in most cases, you need to move\s
outside to match whitespaces and trim them later, check Important Notes About Lookbehind - Don't know what pattern
{my_variable_length_keyword}
has though, but it seems it doesn't contain whitespaces,\S
can match one non-whitespace character.
CodePudding user response:
You do not need to use any lookarounds, nor do you need to trim the prefix matched with the regex, since with REGEXP_SUBSTR
you can specify a capturing group and extract just its contents.
SELECT regexp_substr(my_field, 'ABC\\s DEF\\s (\\S )', 1, 1, 'e') FROM my_table
This is the regex demo.
Details:
ABC\s DEF\s (\S )
-ABC
, one or more whitespaces,DEF
, one or more whitespaces (this text is just matched) and then one or more non-whitespace chars are captured into Group 1 (note you need to double escape backslashes as the single backslash is used to form string escape sequences, and to introduce a literal\
it needs doubling)1
- start searching from the first char in the string1
- tells the regex engine to extract the first occurrence of the pattern'e'
- enables extraction of the subexpression (aka capturing group value) from the resulting match.