I have a table in Oracle like the following:
KEY,VALS
k1,"a,b"
I need it to look like:
KEY,VAL
k1,a
k1,b
I did this with CONNECT BY
and LEVEL
, following an example:
with t as (
select 'k1' as key, 'a,b' as vals
from dual
)
select key, regexp_substr(vals, '[^,] ', 1, level) as val
from t
connect by LEVEL <= length(vals) - length(replace(vals, ',')) 1
But when I have multiple rows in the table, and the vals
can be comma-delimited values of different lengths, like:
KEY,VALS
k1,"a,b"
k2,"c,d,e"
I'm looking for a result like:
KEY,VAL
k1,a
k1,b
k2,c
k2,d
k2,e
But the naive approach above doesn't work because every level is connected with the one above it, resulting in:
with t as (
select 'k1' as key, 'a,b' as vals
from dual
union
select 'k2' as key, 'c,d,e' as vals
from dual
)
select key, regexp_substr(vals, '[^,] ', 1, level) as val
from t
connect by LEVEL <= length(vals) - length(replace(vals, ',')) 1
KEY,VAL
k1,a
k1,b
k2,e
k2,d
k2,e
k2,c
k1,b
k2,e
k2,d
k2,e
I suspect I need some kind of CONNECT BY PRIOR condition, but I'm not sure what. When trying to match by keys:
connect by prior key = key
and LEVEL <= length(vals) - length(replace(vals, ',')) 1
I get an ORA-01436: CONNECT BY loop in user data
error.
What's the right approach here?
CodePudding user response:
Option 1: Simple, fast string functions and a recursive query:
with t (key, vals) as (
SELECT 'k1', 'a,b' FROM DUAL UNION ALL
SELECT 'k2', 'c,d,e' FROM DUAL
),
bounds (key, vals, spos, epos) AS (
SELECT key, vals, 1, INSTR(vals, ',', 1)
FROM t
UNION ALL
SELECT key, vals, epos 1, INSTR(vals, ',', epos 1)
FROM bounds
WHERE epos > 0
)
SEARCH DEPTH FIRST BY key SET key_order
SELECT key,
CASE epos
WHEN 0
THEN SUBSTR(vals, spos)
ELSE SUBSTR(vals, spos, epos - spos)
END AS val
FROM bounds;
Option 2: Slower regular expressions in a LATERAL joined hierarchical query
This option requires Oracle 12 or later.
with t (key, vals) as (
SELECT 'k1', 'a,b' FROM DUAL UNION ALL
SELECT 'k2', 'c,d,e' FROM DUAL
)
SELECT key, val
FROM t
LEFT OUTER JOIN LATERAL (
SELECT regexp_substr(vals, '[^,] ', 1, level) AS val
FROM DUAL
CONNECT BY LEVEL <= REGEXP_COUNT(vals, '[^,] ')
)
ON (1 = 1)
Option 3: Recursive query correlating to parent rows.
This option is the slowest of the options as it needs to correlate between levels of the hierarchy and generate a GUID at each step (which is seemingly useless but prevents unnecessary recursion).
with t (key, vals) as (
SELECT 'k1', 'a,b' FROM DUAL UNION ALL
SELECT 'k2', 'c,d,e' FROM DUAL
)
SELECT key,
regexp_substr(vals, '[^,] ', 1, level) AS val
FROM t
CONNECT BY LEVEL <= REGEXP_COUNT(vals, '[^,] ')
AND PRIOR key = key
AND PRIOR SYS_GUID() IS NOT NULL;
Which all output:
KEY VAL k1 a k1 b k2 c k2 d k2 e
db<>fiddle here