Home > Mobile >  BigQuery JSON element extraction
BigQuery JSON element extraction

Time:02-02

I have a table in BigQuery with a JSON column, see below.

doc_id data
222 {...}
333 {...}

The data JSON column looks like the IDs are set as headers.

{
    "1675223776617": {
        "author": "aaa",
        "new": "2023-02-01",
        "old": null,
        "property": "asd",
        "sender": "wew"
    },
    "1675223776618": {
        "author": "aaa",
        "new": true,
        "old": null,
        "property": "asd",
        "sender": "ewew"
    },
    "1675223776619": {
        "author": "bbb",
        "new": "ySk2btk7",
        "old": null,
        "property": "qwe",
        "sender": "yyy"
    }
}

I would like to extract this JSON into this format using SQL in BigQuery.

Note, the header id isn't defined in the JSON.

doc_id id author new old property sender
222 1675223776617 aaa 2023-02-01 null asd wew
222 1675223776618 aaa true null asd ewew
222 1675223776619 bbb ySk2btk7 null qwe yyy

I tried using the JSON_EXTRACT function without any success.

CodePudding user response:

You might consider below approach using javascript UDF.

CREATE TEMP FUNCTION flatten_json(json STRING)
RETURNS ARRAY<STRUCT<id STRING, author STRING, new STRING, old STRING, property STRING, sender STRING>>
LANGUAGE js AS """
  result = [];
  for (const [key, value] of Object.entries(JSON.parse(json))) {
    value["id"] = key; result.push(value);
  }
  return result;
""";

WITH sample_table AS (
  SELECT 222 doc_id, '''{
    "1675223776617": {
        "author": "aaa",
        "new": "2023-02-01",
        "old": null,
        "property": "asd",
        "sender": "wew"
    },
    "1675223776618": {
        "author": "aaa",
        "new": true,
        "old": null,
        "property": "asd",
        "sender": "ewew"
    },
    "1675223776619": {
        "author": "bbb",
        "new": "ySk2btk7",
        "old": null,
        "property": "qwe",
        "sender": "yyy"
    }
  }''' data
)
SELECT doc_id, flattened.*
  FROM sample_table, UNNEST(flatten_json(json)) flattened;

Query results

enter image description here

  • Related