Home > Blockchain >  Expanding Oracle rows with comma-delimited values into multiple rows
Expanding Oracle rows with comma-delimited values into multiple rows

Time:02-22

I have a table in Oracle like the following:

KEY,VALS
k1,"a,b"

I need it to look like:

KEY,VAL
k1,a
k1,b

I did this with CONNECT BY and LEVEL, following an example:

with t as (
    select 'k1' as key, 'a,b' as vals
    from dual
)
select key, regexp_substr(vals, '[^,] ', 1, level) as val
from t
connect by LEVEL <= length(vals) - length(replace(vals, ','))   1

But when I have multiple rows in the table, and the vals can be comma-delimited values of different lengths, like:

KEY,VALS
k1,"a,b"
k2,"c,d,e"

I'm looking for a result like:

KEY,VAL
k1,a
k1,b
k2,c
k2,d
k2,e

But the naive approach above doesn't work because every level is connected with the one above it, resulting in:

with t as (
    select 'k1' as key, 'a,b' as vals
    from dual
    union
    select 'k2' as key, 'c,d,e' as vals
    from dual
)
select key, regexp_substr(vals, '[^,] ', 1, level) as val
from t
connect by LEVEL <= length(vals) - length(replace(vals, ','))   1
KEY,VAL
k1,a
k1,b
k2,e
k2,d
k2,e
k2,c
k1,b
k2,e
k2,d
k2,e

I suspect I need some kind of CONNECT BY PRIOR condition, but I'm not sure what. When trying to match by keys:

connect by prior key = key
       and LEVEL <= length(vals) - length(replace(vals, ','))   1

I get an ORA-01436: CONNECT BY loop in user data error.

What's the right approach here?

CodePudding user response:

Option 1: Simple, fast string functions and a recursive query:

with t (key, vals) as (
    SELECT 'k1', 'a,b'   FROM DUAL UNION ALL
    SELECT 'k2', 'c,d,e' FROM DUAL
),
bounds (key, vals, spos, epos) AS (
  SELECT key, vals, 1, INSTR(vals, ',', 1)
  FROM t
UNION ALL
  SELECT key, vals, epos   1, INSTR(vals, ',', epos   1)
  FROM bounds
  WHERE  epos > 0
)
SEARCH DEPTH FIRST BY key SET key_order
SELECT key,
       CASE epos
       WHEN 0
       THEN SUBSTR(vals, spos)
       ELSE SUBSTR(vals, spos, epos - spos)
       END AS val
FROM   bounds;

Option 2: Slower regular expressions in a LATERAL joined hierarchical query

This option requires Oracle 12 or later.

with t (key, vals) as (
    SELECT 'k1', 'a,b'   FROM DUAL UNION ALL
    SELECT 'k2', 'c,d,e' FROM DUAL
)
SELECT key, val
FROM   t
       LEFT OUTER JOIN LATERAL (
         SELECT regexp_substr(vals, '[^,] ', 1, level) AS val
         FROM   DUAL
         CONNECT BY LEVEL <= REGEXP_COUNT(vals, '[^,] ')
       )
       ON (1 = 1)

Option 3: Recursive query correlating to parent rows.

This option is the slowest of the options as it needs to correlate between levels of the hierarchy and generate a GUID at each step (which is seemingly useless but prevents unnecessary recursion).

with t (key, vals) as (
    SELECT 'k1', 'a,b'   FROM DUAL UNION ALL
    SELECT 'k2', 'c,d,e' FROM DUAL
)
SELECT key,
       regexp_substr(vals, '[^,] ', 1, level) AS val
FROM   t
CONNECT BY LEVEL <= REGEXP_COUNT(vals, '[^,] ')
AND PRIOR key = key
AND PRIOR SYS_GUID() IS NOT NULL;

Which all output:

KEY VAL
k1 a
k1 b
k2 c
k2 d
k2 e

db<>fiddle here

  • Related