Home > Software engineering >  Generate rows from string of numbers
Generate rows from string of numbers

Time:04-29

I have an Oracle 18c table that has strings like this:

select
    '((0 5 0, 10 10 11.18, 30 0 33.54),(50 10 33.54, 60 10 43.54))' as multipart_lines
    --There are more rows in the actual table.
from
    dual

    MULTIPART_LINES                                              
    -------------------------------------------------------------
    ((0 5 0, 10 10 11.18, 30 0 33.54),(50 10 33.54, 60 10 43.54))
--      v1      v2           v3           v4           v5
--  |            part 1             | |          part 2         | 
  • Individual coordinates are separated by spaces.
  • Vertices (X Y Z coordinates) are separated commas.
  • Line parts are wrapped in brackets and separated by commas.

In a query, I want to generate rows for each vertex:

PART_NUM   VERTEX_NUM          X          Y          Z
---------- ---------- ---------- ---------- ----------
         1          1          0          5          0
         1          2         10         10      11.18
         1          3         30          0      33.54
         2          1         50         10      33.54
         2          2         60         10      43.54
  • I want to do this in a query. I don't want to insert rows into a table.
  • Unfortunately, I don't have CREATE TYPE privileges in the database. But I can create functions (and of course, inline functions are an option too).

How can I generate rows from the numbers (vertices) in the string?

Related: Oracle Ideas - generate_series() function

CodePudding user response:

It would be so much simpler if the input was in some standard format - for example JSON. Then the task would be trivial. Do you have any power over that?

If not, you could either transform the input into proper JSON (or similar), or you could attack the problem directly. I illustrate the latter below, assuming Oracle version 12.1 or higher.

with
  inputs (id, multipart_lines) as (
    select 2810,
      '((0 5 0, 10 10 11.18, 30 0 33.54),(50 10 33.54, 60 10 43.54))'
                                  from dual union all
    select 7284, '((-2.3 0.2 3))' from dual
  )
select id, part_num, vertex_num, x, y, z
from   inputs 
       cross join lateral
       ( select  level as part_num,
                 regexp_substr(multipart_lines,
                               '\(([^()] )\)', 1, level, null, 1) as part
         from    dual
         connect by level <= regexp_count(multipart_lines, '\(') - 1
       )
       cross join lateral
       (
         select  level as vertex_num,
                 regexp_substr(part, '[^,] ', 1, level) as vertex
         from    dual
         connect by level <= regexp_count(part, ',')   1
       )
       cross join lateral
       (
         select   to_number(regexp_substr(vertex, '[^ ] ', 1, 1)) as x,
                  to_number(regexp_substr(vertex, '[^ ] ', 1, 2)) as y,
                  to_number(regexp_substr(vertex, '[^ ] ', 1, 3)) as z
         from     dual
       )
order by id, part_num, vertex_num   --  if needed
;

Output (from the sample inputs I included in the query):

        ID   PART_NUM VERTEX_NUM          X          Y          Z
---------- ---------- ---------- ---------- ---------- ----------
      2810          1          1          0          5          0
      2810          1          2         10         10      11.18
      2810          1          3         30          0      33.54
      2810          2          1         50         10      33.54
      2810          2          2         60         10      43.54
      7284          1          1       -2.3         .2          3

CodePudding user response:

I see that mathguy has got there. I was working with from json_table but I can't unnest 2 rows at a time otherwise I'd be basically there with a second use of row_number() over (partition by Paru_num).

create table sample(value varchar(100));
insert into sample values 
('((0 5 0, 10 10 11.18, 30 0 33.54),(50 10 33.54, 60 10 43.54))')

1 rows affected

with a as (
select '["'|| regexp_replace(value,'\(|\)','') ||'"]' a from sample
),
b as (
select regexp_replace(a,', ?','","') b from a
),
c as (
SELECT value c
FROM json_table( (select b from b) , '$[*]'
                COLUMNS (value PATH '$')
               )
),
d as (
SELECT
  c d,
  instr(c,' ') s1,
  instr(c,' ',instr(c,' ') 1) s2
from c)
select 
 substr(d,0,s1) x,
 substr(d,s1 1,s2-s1) y,
 substr(d,s2 1) z
from d
X   | Y   | Z    
:-- | :-- | :----
0   | 5   | 0    
10  | 10  | 11.18
30  | 0   | 33.54
50  | 10  | 33.54
60  | 10  | 43.54
with a as (
select '["'|| regexp_replace(value,'\( |\) ','"') ||'"]' a from sample
),
b as(
select replace(a,'""','"')b from a
),
c as (
SELECT 
  row_number() over (order by 'zero') pn,
  value c
FROM json_table( (select b from b) , '$[*]'
                COLUMNS (value PATH '$')
)    ),
d as (
select '["'|| pn ||' '|| regexp_replace(c,', ?','","'||pn||' ')||'"]' d from c
)
select * from d
| D                                          |
| :----------------------------------------- |
| ["1 0 5 0","1 10 10 11.18","1 30 0 33.54"] |
| ["2 50 10 33.54","2 60 10 43.54"]          |
with a as (
select '["'|| regexp_replace(value,'\( |\) ','"') ||'"]' a from sample
),
b as(
select replace(a,'""','"')b from a
),
c as (
SELECT 
  row_number() over (order by 'zero') pn,
  value c
FROM json_table( (select b from b) , '$[*]'
                COLUMNS (value PATH '$')
)    ),
d as (
select '["'|| pn ||' '|| regexp_replace(c,', ?','","'||pn||' ')||'"]' d from c
),
e as (
SELECT 
  row_number() over (order by 'zero') pn,
  value c
FROM json_table( (select d from d) , '$[*]'
                COLUMNS (value PATH '$')
)    )
select * from e
ORA-01427: single-row subquery returns more than one row

db<>fiddle here

CodePudding user response:

As an alternative - here is how you can process the input strings to convert them to proper JSON strings; then the task becomes trivial. Showing just the JSON-ization first, separately, as it really is the meaningful part of this solution; then after I show the query and result, I will complete the solution by adding the JSON manipulation.

with
  inputs (id, multipart_lines) as (
    select 2810,
      '((0 5 0, 10 10 11.18, 30 0 33.54),(50 10 33.54, 60 10 43.54))'
                                  from dual union all
    select 7284, '((-2.3 0.2 3))' from dual
  )
, j (id, ml) as (
    select id,
           regexp_replace(
             regexp_replace(
               regexp_replace(
                 regexp_replace(
                   regexp_replace(multipart_lines
                   , '\(\s*\(\s*', '[[[')
                 , '\s*\)\s*\)', ']]]')
               , '\s*\)\s*,\s*\(\s*', ']]@[[')
             , '\s*,\s*', '],[')
           , '\s |@', ',')
    from   inputs
  )
select * from j;



   ID ML                                                                  
----- --------------------------------------------------------------------
 2810 [[[0,5,0],[10,10,11.18],[30,0,33.54]],[[50,10,33.54],[60,10,43.54]]]
 7284 [[[-2.3,0.2,3]]] 

Your inputs should really look like the strings in column ml in my subquery j - then you could process them like this:

with
  inputs (id, multipart_lines) as (
            ........
  )
, j (id, ml) as (
            ........
  )
select id, part_num, vertex_num, x, y, z
from   j,
       json_table(ml, '$[*]'
                  columns (
                    part_num for ordinality,
                    nested path '$[*]'
                    columns (
                      vertex_num for ordinality,
                      x number path '$[0]',
                      y number path '$[1]',
                      z number path '$[2]'
                    )
                  )
       )
order by id, part_num, vertex_num   --  if needed
;

The output is the same as in my other answer.

  • Related