I'm attempting to parse JSON data from Zendesk using the v: structure

With standard fields, like id, this works perfectly, but I can't find a way to parse the custom fields, where the structure is:

"custom_fields": [
    {
      "id": 57852188,
      "value": ""
    },
    {
      "id": 57522467,
      "value": ""
    },
    {
      "id": 57522487,
      "value": ""
    }
]

The general format that I have been using is:

 Select v:id,v:updatedat
 from zd_tickets

Updated data:

{ 
    "id":151693, 
    "brand_id": 36000, 
    "created_at": "2022-0523T19:26:35Z", 
    "custom_fields": [
        { "id": 57866008, "value": false }, 
        { "id": 360022282754, "value": "" },
        { "id": 80814087, "value": "NC" } ], 
    "group_id": 36000770 
} 

CodePudding user response:

If you want to select all repeating elements you will need to use FLATTEN, otherwise you can use standard notation. This is all documented here: https://docs.snowflake.com/en/user-guide/querying-semistructured.html#retrieving-a-single-instance-of-a-repeating-element
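For example (a sketch, assuming v is the VARIANT column from the question): standard path notation reaches one element at a known position, while LATERAL FLATTEN expands all of them:

```sql
-- Standard notation: a single element at a known array position
select v:custom_fields[0].value::text as first_value
from zd_tickets;

-- FLATTEN: one output row per element of the custom_fields array
select t.v:id::number as ticket_id,
       f.value:id::number as field_id,
       f.value:value::text as field_value
from zd_tickets t,
     lateral flatten(input => t.v:custom_fields) f;
```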

CodePudding user response:

So, using this CTE to access the data in a way that looks like a table:

with data(json) as (
    select parse_json(column1) from values
    ('{ 
    "id":151693, 
    "brand_id": 36000, 
    "created_at": "2022-0523T19:26:35Z", 
    "custom_fields": [
        { "id": 57866008, "value": false }, 
        { "id": 360022282754, "value": "" },
        { "id": 80814087, "value": "NC" } ], 
    "group_id": 36000770 
} ')
)

SQL to unpack the top-level items, which you have shown you already have working:

select
    json:id::number as id
    ,json:brand_id::number as brand_id
    ,try_to_timestamp(json:created_at::text, 'yyyy-mmddThh:mi:ssZ') as created_at
    ,json:custom_fields as custom_fields
from data;

gives:

ID BRAND_ID CREATED_AT CUSTOM_FIELDS
151693 36000 2022-05-23 19:26:35.000 [ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ]

So now, how to tackle that JSON array of custom_fields..

Well, if you only ever have three values, and the order is always the same..

select
    to_array(json:custom_fields) as custom_fields_a
    ,custom_fields_a[0] as field_0
    ,custom_fields_a[1] as field_1
    ,custom_fields_a[2] as field_2
from data;

gives:

CUSTOM_FIELDS_A FIELD_0 FIELD_1 FIELD_2
[ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ] { "id": 57866008, "value": false } { "id": 360022282754, "value": "" } { "id": 80814087, "value": "NC" }

So we can use FLATTEN to access those objects, which produces "more rows":

select
    d.json:id::number as id
    ,d.json:brand_id::number as brand_id
    ,try_to_timestamp(d.json:created_at::text, 'yyyy-mmddThh:mi:ssZ') as created_at
    ,f.*
from data as d
    ,table(flatten(input=>d.json:custom_fields)) f;

gives:

ID BRAND_ID CREATED_AT SEQ KEY PATH INDEX VALUE THIS
151693 36000 2022-05-23 19:26:35.000 1 [0] 0 { "id": 57866008, "value": false } [ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ]
151693 36000 2022-05-23 19:26:35.000 1 [1] 1 { "id": 360022282754, "value": "" } [ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ]
151693 36000 2022-05-23 19:26:35.000 1 [2] 2 { "id": 80814087, "value": "NC" } [ { "id": 57866008, "value": false }, { "id": 360022282754, "value": "" }, { "id": 80814087, "value": "NC" } ]

So we can pull out known values (a manual PIVOT):

select
    d.json:id::number as id
    ,d.json:brand_id::number as brand_id
    ,try_to_timestamp(d.json:created_at::text, 'yyyy-mmddThh:mi:ssZ') as created_at
    ,max(iff(f.value:id=80814087, f.value:value::text, null)) as v80814087
    ,max(iff(f.value:id=360022282754, f.value:value::text, null)) as v360022282754
    ,max(iff(f.value:id=57866008, f.value:value::text, null)) as v57866008
from data as d
    ,table(flatten(input=>d.json:custom_fields)) f
group by 1,2,3, f.seq;

Grouping by f.seq means that if you have many "rows" of input, these will be kept apart even if they share common values for columns 1, 2, and 3.

gives:

ID BRAND_ID CREATED_AT V80814087 V360022282754 V57866008
151693 36000 2022-05-23 19:26:35.000 NC <empty string> false
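To see why f.seq matters, here is a sketch with two input rows that share identical top-level values (the ids and values are made up for illustration):

```sql
with data(json) as (
    select parse_json(column1) from values
    ('{ "id":1, "custom_fields": [ { "id": 10, "value": "a" } ] }'),
    ('{ "id":1, "custom_fields": [ { "id": 10, "value": "b" } ] }')
)
select
    d.json:id::number as id
    ,max(iff(f.value:id=10, f.value:value::text, null)) as v10
from data as d
    ,table(flatten(input=>d.json:custom_fields)) f
group by 1, f.seq;
-- keeps the two source rows apart ("a" and "b"); drop f.seq from
-- the GROUP BY and they collapse into one, losing data
```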

Now, if you do not know the names of the values, there is no way, short of dynamic SQL and double parsing, to turn rows into columns.
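The first (parsing) pass of such a dynamic approach could be sketched like this — a hypothetical example that uses LISTAGG to build the column expressions, whose output would then have to be spliced into a second, generated SELECT and executed:

```sql
-- First pass: discover the distinct custom-field ids and generate one
-- max(iff(...)) column expression per id (sketch only; the resulting
-- text still needs to be wrapped in a full SELECT and run separately)
select listagg(
         'max(iff(f.value:id=' || id || ', f.value:value::text, null)) as v' || id
         , ', ') as column_sql
from (
    select distinct f.value:id::number as id
    from data as d
        ,table(flatten(input=>d.json:custom_fields)) f
);
```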
