I am having a difficult time trying to return a specific section of string from a field (BSE.NOTES) using REGEXT_SUBSTR. For my query, I have a VARCHAR2 field with a specific text that I would like to return:
Hospital Dept Name Case: SP12-34567 Authorizing Provider: D. K, MD MSCR Collected: 07/09/2021 12:49 PM Ordering Location: Hospital Received: 07/09/2021 03:23 PM Pathologist: D. L., MD Specimens: A) - Body part 1 B) - Body part 2
From this text, I need to return the string "Case: SP-***" for each record. I tried using the following code, but only get NULL values:
REGEXP_SUBSTR(BSE.NOTES, '(\S|^)(Case\:\s)([0-9]\-\[0-9])', 1, 1, NULL) AS CASE_NUMB
I am not very versed and using regexp_substr() so any help is greatly appreciated!
CodePudding user response:
Your pattern is looking for a single numeric digit either side of the -
, and doesn't allow for the 'SP' part; it will also only match after the start of the line or a non-whitespace character, and your parentheses don't look right. And you haven't specified that you want a sub-pattern (read more in the docs).
This will get the value you said you wanted:
select REGEXP_SUBSTR(BSE.NOTES, '(\s|^)(Case:\sSP[0-9] -[0-9] )', 1, 1, null, 2) AS CASE_NUMB
from bse
CASE_NUMB |
---|
Case: SP12-34567 |
Or if you actually only want the second part of that you can add more sub-patterns:
select REGEXP_SUBSTR(BSE.NOTES, '(\s|^)(Case:\s(SP[0-9] -[0-9] ))', 1, 1, null, 3) AS CASE_NUMB
from bse
CASE_NUMB |
---|
SP12-34567 |
Using regular expressions to extract data from fields like this is a bit error-prone though, particularly where it's essentially free-form text that could have odd capitalisation (you can make the third argument 'i'
to help with that), spacing (maybe make the second \s
optional, as \s?
), typos or other oddities... And you may not need the anchor/whitespace at the start, but it might not hurt, particularly if you do go case-insensitive.
CodePudding user response:
Do you mean this?
WITH
bse(notes) AS (
SELECT 'Hospital Dept Name Case: SP12-34567 Authorizing Provider: D. K,'
||' MD MSCR Collected: 07/09/2021 12:49 PM Ordering Location:'
||' Hospital Received: 07/09/2021 03:23 PM Pathologist: D. L.'
||', MD Specimens: A) - Body part 1 B) - Body part 2'
FROM dual
)
SELECT
REGEXP_SUBSTR(
bse.notes -- the string
, 'Case:\s*SP\d\d-\d ' -- the pattern "Case:",
-- zero, one or more whitespace (\s*),
-- "SP", two digits (\d\d), a hyphen,
-- one or more digits (\d )
) AS extr
FROM bse;
-- out extr
-- out ------------------
-- out Case: SP12-34567