Home > Back-end >  AWS Athens regexp_extract
AWS Athens regexp_extract

Time:04-23

I am trying to extract a part of the string 'c://abcd /abcdef/0012wetr_1234567890.csv' between the last '/' and '_' characters.

0012wetr

I am able to extract everything after the last '/' character

select regexp_extract('c://abcd /abcdef/0012wetr_1234567890.csv', '([^/]*)$');

0012wetr_1234567890.csv

Unfortunately I am stuck and don't know how to split it further.

Your help would be appreciated. Cheers, A.

CodePudding user response:

Maybe it is overkill but I managed to get required result using next combination of lookaheads - (?!\/)[^\/] (?=_):

select regexp_extract('c://abcd /abcdef/0012wetr_1234567890.csv', '(?!\/)[^\/] (?=_)');

Output:

_col0
0012wetr

regex101.com

CodePudding user response:

You can use a REGEXP_REPLACE approach:

REGEXP_REPLACE('c://abcd /abcdef/0012wetr_1234567890.csv', '.*/([^_] ).*', '$1')

See the regex demo.

If you need to keep the result blank if there is no match, add |. at the end of the pattern:

REGEXP_REPLACE('c://abcd /abcdef/0012wetr_1234567890.csv', '.*/([^_] ).*|. ', '$1')

Details:

  • .* - any zero or more chars other than line break chars as many as possible
  • / - a / char
  • ([^_] ) - Group 1: any one or more chars other than _
  • .* - the rest of the line
  • | - or
  • . - any one or more chars other than line break chars as many as possible.
  • Related