I need a REGEX expression in PLSQL that will extract a substring from a string that contains the delimiter inside the string 0 to multiple times. The string may not contain the delimiter or it may contain multiple instances of the delimiter. The delimiter in my case is the backslash (/).
For example, for vars:
Text1 = 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION:
/PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: UNCLEAR /TOOTH HISTORY GIVEN: YES
/BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A '
Text2 = 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION:
/PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: 12/22/2021 /TOOTH HISTORY GIVEN: YES
/BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A '
Text3 = 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION:
/PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: 12/22/2021, 3/5/2022 /TOOTH HISTORY
GIVEN: YES /BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A '
the desired result:
Text1_result = 'Unclear'
Text2_result = '12/25/2021'
Text3_result = '12/25/2021, 3/5/2022'
I used REGEXP_SUBSTR to extract the info this far but I'm having trouble creating a function that can handle multiple instances of the delimiter INSIDE the string its delimiting
EDIT: updated text inputs. Also, the field names behind the DATES OF SERVICE field will not be constant. So I can't delimit it on 'TOOTH' either.
Thank you.
CodePudding user response:
As mentioned in the comments: divide the strings into 3 groups:
- Everything up to '/DATES OF SERVICE: '
- what you want to extract
- Evertything including and after ' /TOOTH HISTORY'
use REGEXP_REPLACE
backreference to only display the 2nd group:
WITH test_data (s) AS
(
SELECT 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION: /PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: UNCLEAR /TOOTH HISTORY GIVEN: YES /BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A' FROM DUAL UNION ALL
SELECT 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION: /PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: 12/22/2021 /TOOTH HISTORY GIVEN: YES /BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A' FROM DUAL UNION ALL
SELECT 'CALLBACK NUMBER: 999-999-9999 /SEGMENT: DENTAL. ASSOCIATE RESOLUTION: /PARTICIPATION: PAR PER PROVIDER /DATES OF SERVICE: 12/22/2021, 3/5/2022 /TOOTH HISTORY GIVEN: YES /BENEFITS NEEDED: N/A/CLAIM NUMBER: N/A /TRANSFERRED: N/A ' FROM DUAL
)
select
REGEXP_REPLACE(
s
,'^(.*/DATES OF SERVICE: )(. )( /TOOTH HISTORY. )'
,'\2')
from test_data;
Unclear
12/25/2021
12/25/2021, 3/5/2022
Adjust to your actual code as needed.
--update--
To allow for other strings than "/TOOTH", the following change is needed. It makes the second subexpression non-greedy and in the third expression checks for a single whitespace, followed by a forward slash and then any character:
select
REGEXP_REPLACE(
s
,'^(.*/DATES OF SERVICE: )(. ?)([[:space:]]{1}/.*)'
,'\2')
from test_data;