I have this text:
Process explanation: plasma gasification with 50% alcohol added. Final activity for manager: some activity
.
I just wanna grab this part: plasma gasification with 50% alcohol added
. Sometimes this exist: Final activity for manager: some activity
and sometimes it doesn't. The part extracted may contain any special characters: plasma gasification with 50% alcohol added
.
I just want to grab any part after Process explanation
but not including Final activity..
I have this so far:
select regexp_substr('Process explanation: plasma gasification combined with centrifugal activity. Final activity for manager: some activity',
'(.*process[[:space:]]*(explanation)?[[:space:]]*:[[:space:]]*)(.*)([[:space:]]*final[[:space:]]*activity[[:space:]]*for[[:space:]]*manager.*$)?',
1, 1, 'i', 3) as extracted from dual;
CodePudding user response:
You can use
Process\s explanation\s*:\s*(.*?)(\.\s*Final activity|$)
CodePudding user response:
You do not need (slow) regular expressions and can use simple string functions:
SELECT CASE
WHEN start_pos = 0
THEN NULL
WHEN end_pos = 0
THEN SUBSTR(value, start_pos start_len)
ELSE SUBSTR(value, start_pos start_len, end_pos - start_pos - start_len)
END AS match
FROM (
SELECT value,
INSTR(
UPPER(value),
UPPER(match_start),
1
) AS start_pos,
LENGTH(match_start) AS start_len,
INSTR(
UPPER(value),
UPPER(match_end),
INSTR(UPPER(value), UPPER(match_start), 1, 1)
) AS end_pos
FROM table_name
CROSS JOIN (
SELECT 'Process explanation:' AS match_start,
'Final activity' AS match_end
FROM DUAL
)
)
Note: if the sub-strings you want to match are always in the same case then you do not need to use UPPER
.
Note 2: If you want to remove leading and trailing spaces then use the TRIM
function on the output.
Which, for the sample data:
CREATE TABLE table_name (value) AS
SELECT 'Process explanation: plasma gasification with 50% alcohol added. Final activity for manager: some activity' FROM DUAL;
Outputs:
MATCH plasma gasification with 50% alcohol added.
db<>fiddle here