Home > Mobile >  Unnesting stringified JSON with two sets of double quotes
Unnesting stringified JSON with two sets of double quotes

Time:11-14

I have a python script that polls an API and stores the response in a BQ table with layered, stringified JSON field that looks like so:

"{""inputs"": {""Layer1"": {""Layer2"": {""Layer3"": {""item1"": 0.7, ""item2"": 10.0, ""item3"": 0.14}}}}"

I created the following query to unnest:

  ro.id,
  json_extract_scalar(s,
    '$.item1') AS item1,
  json_extract_scalar(s,
    '$.item2') AS item2,
  json_extract_scalar(s,
    '$.item3') AS item3
FROM
  `project.dataset.table` ro
LEFT JOIN
  UNNEST(json_extract_array(response)) AS r
LEFT JOIN
  UNNEST (json_extract_array("Layer1")) AS o
LEFT JOIN
  UNNEST (json_extract_array("Layer2")) AS sc
LEFT JOIN
  UNNEST (json_extract_array("Layer3")) AS s

The query runs and builds the table as expected, but all of the data from the unnested fields is null.

I do not have control over how the script stores the data, so I can only leverage SQL to solve this problem. Are the two sets of double quotes my main issue here? If so, what is the best way to address that?

CodePudding user response:

Consider below approach

SELECT 
  json_extract_scalar(items,'$.item1') AS item1,
  json_extract_scalar(items,'$.item2') AS item2,
  json_extract_scalar(items,'$.item3') AS item3,
FROM `project.dataset.table` ro,
UNNEST([struct(json_extract(response, '$.inputs.Layer1.Layer2.Layer3') as items)])             

if applied to sample data in your question - output is

enter image description here

  • Related