Home > front end >  Split Json into multiple records based on multiple Strings Bigquery
Split Json into multiple records based on multiple Strings Bigquery

Time:10-18

I have input data in the below format

id| data                                                                              
------------------------------------------------------------------------------------------------------------------------
1 | {"desc": "ABC", "sno": 22, "maincode": 3345},{"desc": "XYZ", "sno": 41, "maincode": 3100}
2 | {"value": 3340,"desc": "ABC", "sno": 22, "maincode": 3345},{"value": 400,"desc": "XYZ", "sno": 22, "maincode": 2205}

In the above data I have json multiple times within the same record. I would like to split these jsons into multiple rows. Basically, each json is split with comma. I was more thinking of using },{ in my split function and then output into multiple records, but I will have to concatenate } and { the moment i split them which is kind of inappropriate.

Expected Output data:

id| data                                                                              
-------------------------------------------------------------------
1 | {"desc": "ABC", "sno": 22, "maincode": 3345}
1 | {"desc": "XYZ", "sno": 41, "maincode": 3100}
2 | {"value": 3340,"desc": "ABC", "sno": 22, "maincode": 3345}
2 | {"value": 400,"desc": "XYZ", "sno": 22, "maincode": 2205}

The catch here is the jsons can start with different set of keys ( I have the list of them ). For example, the first record in the input data starts with the key "desc". The second record in the input json starts with "value". I need to consider this appropriately.

Is there any better solution for this?

CodePudding user response:

I thinks you can add [ and ] to make data string a valid json array and apply JSON functions.

WITH sample_data AS (
  SELECT 1 id, '{"desc": "ABC", "sno": 22, "maincode": 3345},{"desc": "XYZ", "sno": 41, "maincode": 3100}' data UNION ALL
  SELECT 2 id, '{"value": 3340,"desc": "ABC", "sno": 22, "maincode": 3345},{"value": 400,"desc": "XYZ", "sno": 22, "maincode": 2205}'
)
SELECT id, data
  FROM sample_data, UNNEST(JSON_QUERY_ARRAY('[' || data || ']')) data;

enter image description here

CodePudding user response:

You can try the function JSON_EXTRACT_SCALAR and other JSON_EXTRACT functions that BQ offers. They are pretty well described in the documentation. The example code would look like this:

SELECT 
  id,
  JSON_EXTRACT_SCALAR(data, "$.desc") desc,
  JSON_EXTRACT_SCALAR(data, "$.sno") sno,
  JSON_EXTRACT_SCALAR(data, "$.maincode)
FROM 
  sample_data

This code should give you a neat table with 4 columns.

  • Related