Home > database >  spit JSON object column values into new columns on snowflake ( SQL )
spit JSON object column values into new columns on snowflake ( SQL )

Time:06-01

I want to split a json column that has multiple values as shown in the picture below into new columns. I'm using snowflake cause that's where the data is located.

like for example , I want name, predicted, level, score all as new column.

JSON column content

CodePudding user response:

You can use the DOC to flatten the JSON data into different columns.

CodePudding user response:

So with this CTE to provide the input/data:

with data as (
SELECT '---
- trackingId: 0
  name: English
  predicated: 6
  level: higher
  score: 6' as subjects
)

lets strip that in parts:

select d.*
    ,s.seq
    ,s.value as raw
    ,trim(trim(s.value,'-'),' ') as trimmed 
    ,split(trimmed,':') as parts
    ,trim(parts[0]) as key
    ,trim(parts[1]) as val
    ,iff(key='name', val,null) as name
    ,iff(key='predicated', val,null) as predicated
    ,iff(key='level', val,null) as level
    ,iff(key='score', val,null) as score
from data as d,
table(split_to_table(d.subjects, '\n')) s
where trimmed <> ''

gives:

SUBJECTS SEQ RAW TRIMMED PARTS KEY VAL NAME PREDICATED LEVEL SCORE
--- - trackingId: 0 name: English predicated: 6 level: higher score: 6 1 - trackingId: 0 trackingId: 0 [ "trackingId", " 0" ] trackingId 0
--- - trackingId: 0 name: English predicated: 6 level: higher score: 6 1 name: English name: English [ "name", " English" ] name English English
--- - trackingId: 0 name: English predicated: 6 level: higher score: 6 1 predicated: 6 predicated: 6 [ "predicated", " 6" ] predicated 6 6
--- - trackingId: 0 name: English predicated: 6 level: higher score: 6 1 level: higher level: higher [ "level", " higher" ] level higher higher
--- - trackingId: 0 name: English predicated: 6 level: higher score: 6 1 score: 6 score: 6 [ "score", " 6" ] score 6 6

So we can then roll that back up:

select seq
    ,max(name) as name
    ,max(predicated) as predicated
    ,max(level) as level
    ,max(score) as score
from (
    select d.*
        ,s.seq
        ,trim(trim(s.value,'-'),' ') as trimmed 
        ,split(trimmed,':') as parts
        ,trim(parts[0]) as key
        ,trim(parts[1]) as val
        ,iff(key='name', val,null) as name
        ,iff(key='predicated', val,null) as predicated
        ,iff(key='level', val,null) as level
        ,iff(key='score', val,null) as score
    from data as d,
    table(split_to_table(d.subjects, '\n')) s
    where trimmed <> ''
)
group by seq;

giving:

SEQ NAME PREDICATED LEVEL SCORE
1 English 6 higher 6
  • Related