Home > Net >  PLSQL REGEX extract a string between two of the same delimiters when the delimiter is inside the str
PLSQL REGEX extract a string between two of the same delimiters when the delimiter is inside the str

Time:07-29

I need a REGEX expression in PLSQL that will extract a substring from a string that contains the delimiter inside the string 0 to multiple times. The string may not contain the delimiter or it may contain multiple instances of the delimiter. The delimiter in my case is the backslash (/).

For example, for vars:

Text1 = 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION:  
 /PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: UNCLEAR /TOOTH HISTORY GIVEN: YES 
/BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A '

Text2 = 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION:  
 /PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: 12/22/2021 /TOOTH HISTORY GIVEN: YES 
/BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A '

Text3 = 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION:  
 /PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: 12/22/2021, 3/5/2022 /TOOTH HISTORY 
GIVEN: YES /BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A '

the desired result:

Text1_result = 'Unclear'
Text2_result = '12/25/2021'
Text3_result = '12/25/2021, 3/5/2022'

I used REGEXP_SUBSTR to extract the info this far but I'm having trouble creating a function that can handle multiple instances of the delimiter INSIDE the string its delimiting

EDIT: updated text inputs. Also, the field names behind the DATES OF SERVICE field will not be constant. So I can't delimit it on 'TOOTH' either.

Thank you.

CodePudding user response:

As mentioned in the comments: divide the strings into 3 groups:

  • Everything up to '/DATES OF SERVICE: '
  • what you want to extract
  • Evertything including and after ' /TOOTH HISTORY'

use REGEXP_REPLACE backreference to only display the 2nd group:

WITH test_data (s) AS
(
SELECT 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION: /PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: UNCLEAR /TOOTH HISTORY GIVEN: YES /BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A' FROM DUAL UNION ALL
SELECT 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION: /PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: 12/22/2021 /TOOTH HISTORY GIVEN: YES /BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A' FROM DUAL UNION ALL
SELECT 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION: /PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: 12/22/2021, 3/5/2022 /TOOTH HISTORY GIVEN: YES /BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A ' FROM DUAL
)
select 
  REGEXP_REPLACE(
    s
    ,'^(.*/DATES OF SERVICE: )(. )( /TOOTH HISTORY. )'
    ,'\2')
  from test_data;


Unclear
12/25/2021 
12/25/2021, 3/5/2022 

Adjust to your actual code as needed.

--update--

To allow for other strings than "/TOOTH", the following change is needed. It makes the second subexpression non-greedy and in the third expression checks for a single whitespace, followed by a forward slash and then any character:

select 
  REGEXP_REPLACE(
    s
    ,'^(.*/DATES OF SERVICE: )(. ?)([[:space:]]{1}/.*)'
    ,'\2')   
  from test_data;
  • Related