Home > database >  Remove Duplicate nested phrases from a string with Oracle Regexp_replace
Remove Duplicate nested phrases from a string with Oracle Regexp_replace

Time:09-16

as part of a PL/SQL script, we are setting a Varchar variable called 'V_COMMENT' as such:

V_COMMENT := INCOMING_COMMENT || '[' || V_COMMENT || ']';

because this statement is run potentially multiple times for each record, we will eventually end up with a comment like:

process 578 [process 456[process 123]]

There is an issue, when re running this script, sometimes a duplicate tag can be added as such:

process 123 [process 123 [process 000]]

or

process 456 [process 123 [process 123]]

Where a comment is repeated, and nested.

Is there a way to use Regexp_Replace() to remove duplicate tags, nested or otherwise?

OR ALTERNATIVELY

Is there a way to catch occurrences of these errant comments with regexp_like

CodePudding user response:

You can use solution below to remove duplicates comments. in fact, I use dynamic sql within pl/sql to solve the problem.

DECLARE
v_comment CLOB ;
v_comment2 CLOB ; 
v_sql CLOB;

BEGIN
--V_COMMENT := INCOMING_COMMENT || '[' || V_COMMENT || ']'; 
v_comment := 'process 456[process 123[process 123[process 456[process 000]]]]' ;

v_sql := q'{
with v_temp1 as (
  select '}'||v_comment||q'{' as comm0 from dual
  )
 , v_temp2 AS (
  select comm0
       , LEVEL lvl
       , regexp_count(comm0, '[^\[] ') cnt
       , trim(rtrim( regexp_substr(comm0, '[^\[] ', 1, LEVEL), ']' )) AS comm1
       , row_number()OVER(PARTITION BY comm0, trim(rtrim( regexp_substr(comm0, '[^\[] ', 1, LEVEL), ']' ))  ORDER BY LEVEL) rnb
  from v_temp1
  CONNECT BY LEVEL <= regexp_count(comm0, '[^\[] ')
 )
  SELECT listagg(comm1, '[') WITHIN GROUP (ORDER BY lvl) || 
         LPAD(']', regexp_count(listagg(comm1, '[')WITHIN GROUP (ORDER BY lvl), '\['), ']') comm2
  FROM v_temp2
  WHERE rnb = 1
}'
;

--dbms_output.put_line(v_sql); --test

execute immediate v_sql into v_comment2
;

dbms_output.put_line('input v_comment : ' ||v_comment ); --test
dbms_output.put_line('output v_comment2 : '||v_comment2); --test
END;
/

demo : pl/sql

Here is the sql part of my solution :

with v_temp1 as (
  select 'process 578 [process 456 [process 123]]' comm0 from dual union all
  select 'process 123 [process 123 [process 000]]' from dual union ALL
  select 'process 456 [process 123 [process 123 [process 456 ]]]' from dual
  )
 , v_temp2 AS (
  select comm0
       , LEVEL lvl
       , regexp_count(comm0, '[^\[] ') cnt
       , trim(rtrim( regexp_substr(comm0, '[^\[] ', 1, LEVEL), ']' )) AS comm1
       , row_number()OVER(PARTITION BY comm0, trim(rtrim( regexp_substr(comm0, '[^\[] ', 1, LEVEL), ']' ))  ORDER BY LEVEL) rnb
  from v_temp1
  CONNECT BY LEVEL <= regexp_count(comm0, '[^\[] ')
  /*You need to add the following two conditions when processing more than one row*/
  AND PRIOR comm0 = comm0
  AND PRIOR sys_guid() IS NOT NULL
 )
  SELECT comm0, listagg(comm1, '[') WITHIN GROUP (ORDER BY lvl) || 
         LPAD(']', regexp_count(listagg(comm1, '[')WITHIN GROUP (ORDER BY lvl), '\['), ']') comm2
  FROM v_temp2
  WHERE rnb = 1
  GROUP BY comm0
;

demo : sql

  • Related