I have a table called HEADER
, within the table is a specific field called SUMMARY
.
Summary contains an overview of an incident like this fake example:
TYPE: CUSTOMER 1
Address: ,123 S 3456 E,,FAIRYLAND,UT 12345-6789 USA
SITE: 1234-NEVADA-46789
Device Name: ROUTER
INCIDENT TYPE: Down
INCIDENT START TIME: 2022-02-10T08:32:25Z
What I am trying, and failing to do is extract only "Down" after the "INCIDENT TYPE:" line and add this as a new field call "INC_TYPE"
Can someone please help?
CodePudding user response:
You may use REGEXP_SUBSTR
with a capture group:
SELECT HEADER.*, REGEXP_SUBSTR(SUMMARY, 'INCIDENT TYPE: (.*) INCIDENT START TIME:', 1, 1, NULL, 1) AS INC_TYPE
FROM HEADER;
CodePudding user response:
If you have a new line character after the incident type then you can just use a simple capture group:
select h.*,
regexp_substr(h.summary, 'INCIDENT TYPE: (.*)', 1, 1, null, 1) as inc_type
from header h
Based on your example that is probably what you need; but if the values might all be on one line then you can restrict the match based on the following field (assuming that is always there), and including newlines in the match with the 'n' flag:
select h.*,
regexp_substr(h.summary, 'INCIDENT TYPE: (.*)INCIDENT START TIME:', 1, 1, 'n', 1) as inc_type
from header h
... though that will leave the newline character or space as part of the inc_type
, so you would probably want to trim that off:
select h.*,
rtrim(
regexp_substr(h.summary, 'INCIDENT TYPE: (.*)INCIDENT START TIME:', 1, 1, 'n', 1),
chr(32)||chr(13)||chr(10) -- trim trailing space, newline, carriage return
) as inc_type
from header h
CodePudding user response:
You can use REGEXP_SUBSTR
with the multi-line flag to ensure that you match from the start of each line:
SELECT REGEXP_SUBSTR(summary, '^INCIDENT TYPE:\s*(.*)', 1, 1, 'm', 1)
AS inc_type
FROM header;
Which, for the sample data:
CREATE TABLE header (summary) AS
SELECT 'TYPE: CUSTOMER 1
Address: ,123 S 3456 E,,FAIRYLAND,UT 12345-6789 USA
SITE: 1234-NEVADA-46789
Device Name: ROUTER
INCIDENT TYPE: Down
INCIDENT START TIME: 2022-02-10T08:32:25Z' FROM DUAL;
Outputs:
INC_TYPE Down
and:
SELECT REGEXP_SUBSTR(summary, '^TYPE:\s*(.*)', 1, 1, 'im', 1) AS type,
REGEXP_SUBSTR(summary, '^ADDRESS:\s*(.*)', 1, 1, 'im', 1) AS address,
REGEXP_SUBSTR(summary, '^SITE:\s*(.*)', 1, 1, 'im', 1) AS site,
REGEXP_SUBSTR(summary, '^DEVICE NAME:\s*(.*)', 1, 1, 'im', 1)
AS device_name,
REGEXP_SUBSTR(summary, '^INCIDENT TYPE:\s*(.*)', 1, 1, 'im', 1)
AS inc_type,
TO_TIMESTAMP_TZ(
REGEXP_SUBSTR(summary, '^INCIDENT START TIME:\s*(.*)', 1, 1, 'im', 1),
'YYYY-MM-DD"T"HH24:MI:SSTZR'
) AS incident_start_time
FROM header;
Outputs:
TYPE ADDRESS SITE DEVICE_NAME INC_TYPE INCIDENT_START_TIME CUSTOMER 1 ,123 S 3456 E,,FAIRYLAND,UT 12345-6789 USA 1234-NEVADA-46789 ROUTER Down 2022-02-10 08:32:25.000000000 00:00
db<>fiddle here