I have searched Stack Overflow to get an answer to my question, but while I found many interesting cases, none of them quite address mine.
I have a column called fields
in my data, that contains JSON information, such as presented below:
Row Fields
1 [{"label":"Label 1","key":"label_1","description":"Value of label_1"},{"label":"Label 2","key":"label_2","error":"Something"}]
2 [{"description":"something","label":"Row 1","key":"row_1"},{"label":"Row 2","message":"message_1","key":"row_2"}]
In essence, I have many rows of JSON that contain label
and key
, and bunch of other parameters like that. From every {}, I want to extract only label
and key
, and then (optional, but ideally) stretch every label
and key
in every {} to its own row. So, as a result, I would have the following output:
Row Label Key
1 Label 1 label_1
1 Label 2 label_2
2 Row 1 row_1
2 Row 2 row_2
Please note, contents of label
and key
within JSON can be anything (strings, integers, special characters, a mix of everything, etc. In addition, key
and label
can be anywhere in relation to other parameters within each {}.
Here is the Big Query SQL dummy data for convenience:
SELECT '1' AS Row, '[{"label":"Label 1","key":"label_1","description":"Value of label_1"},{"label":"Label 2","key":"label_2","error":"Something"}]' AS Fields
UNION ALL
SELECT '2' AS Row, '[{"description":"something","label":"Row 1","key":"row_1"},{"label":"Row 2","message":"message_1","key":"row_2"}]' AS Fields
I have first thought of using REGEX to isolate all the brackets and only show me information with label
and key
. Then, I looked into BQ Documentation of JSON functions and got very stuck on json_path parameters, specifically because their example doesn't match mine.
CodePudding user response:
Consider below approach
select `row`,
json_extract_scalar(el, '$.label') label,
json_extract_scalar(el, '$.key') key
from your_table, unnest(json_extract_array(fields)) el
if applied to sample data in your question - output is