Replacing a string over multiple rows-CodePudding

I have a table like this one:

ID  Code         CodeType  CodeExt
1   -S_P&S_A     S_P       A B
1   -S_P&S_A     S_A       C
2   S_P/S_A/S_Z  S_P       A B C
2   S_P/S_A/S_Z  S_A       D
2   S_P/S_A/S_Z  S_Z       E
3   S_P          S_P       A

The expected output is

ID  Code_new
1   -(A B)&(C)
2   (A B C)/(D)/(E)
3   (A)

So I want the column Code to be replaced by CodeExt depending on CodeType.

What I tried is

SELECT ID, REPLACE(Code,CodeType,'(' CodeExt ')') AS Code_new
FROM table

but this doesn't deliver the expected output.

CodePudding user response：

Interesting problem. I think I have a working solution you can try. You might need to augment slightly if you have additional separators not shown in your sample data or probably more complex examples but it should give you something you can build on.

First split the code into separate rows on the delimiter (/ or &) by converting to a json array (this provides a sequence for ordering).

Then replace the code with its new value and re-aggregate, adding back the separator for all but the last row.

with n as (
  select ID, Value, seq, Replace(Value,CodeType,Concat('(',CodeExt,')')) NewValue,
    case when seq < Max(seq) over(partition by ID) then
      case when Code like '%&%' then '&' when Code like '%/%' then '/' else '' end
    end sep
  from t
  cross apply (
    select Value, [Key] seq 
    from OpenJson(Concat('["',Replace(Translate(code,'&/', ',,'),',','","'),'"]'))
  )s
)
select ID, String_Agg(Concat(NewValue, sep),'') within group(order by seq) Code_New
from n 
where Value != NewValue
group by ID;

Result:

See Working Fiddle Demo

CodePudding user response：

Another option is to use a recursive CTE.

The way this works is to number each row per Code, then take the first row and do the replacement, then recursively take all of them until there are none left.

This would be more efficient if each row of replacements is numbered (partitioned by Code), as then you can avoid running the row-numbers each time.

WITH numbered AS (
    SELECT *,
      rn = ROW_NUMBER() OVER (PARTITION BY t.Code ORDER BY t.CodeType),
      IsLast = CASE WHEN LEAD(t.CodeType) OVER (PARTITION BY t.Code ORDER BY t.CodeType) IS NULL THEN 1 END
    FROM t
),
cte AS (
    SELECT
      n.ID,
      n.IsLast,
      Code_new = REPLACE(n.Code, n.CodeType, '('   n.CodeExt   ')'),
      Level = 1
    FROM numbered n
    WHERE n.rn = 1
    
    UNION ALL
    
    SELECT
      cte.ID,
      n.IsLast,
      REPLACE(cte.Code_new, n.CodeType, '('   n.CodeExt   ')'),
      cte.Level   1
    FROM cte
    JOIN numbered n ON n.ID = cte.ID AND n.rn = cte.Level   1
)
SELECT
  cte.ID,
  cte.Code_new
FROM cte
WHERE cte.IsLast = 1;

db<>fiddle