Home > Back-end >  SQL Athena: Regular Expression to extract string between multiple parentheses
SQL Athena: Regular Expression to extract string between multiple parentheses

Time:08-19

So I have a table that have store_name column with their location in the parentheses. And what I need is to extract the characters between the parentheses to get their location

store_name
Daniel Bro (Tokyo)
Michael Ole (New Store) (London)

The problem is some row have multiple parentheses, but I know that the location always refer to the last parentheses. Is there a way using RegEx to only extract the strings between the last parentheses.

I've tried using these, but it only extract the string between the first parentheses.

select regexp_extract(store_name, '[^\(\)] (?=\))') as location from table a

The code above returns 'Tokyo' and 'New Store' instead of 'Tokyo' and 'London'. Is there any way to do this?

CodePudding user response:

Add a $ boundary marker to the end of your regex pattern to signify the end of the store name:

SELECT REGEXP_EXTRACT(store_name, '\((.*?)\)$', 1) AS location
FROM TableA;

Note that the third parameter 1 passed to REGEXP_EXTRACT says to return the first (and only) capture group.

CodePudding user response:

You may use this .*

This makes use of the fact that the * quantifier is greedy, so the first .* will consume the entire text. After that the regex engine backtracks (i.e. traverses the string backwards from the end) until the last occurrence of ".*" in the string matches.

  • Related