I have a BigQuery table that contains a column that contains a JSON string. Within the JSON, there may be a key called "person" or "corp" or "sme". I want to run a query that will return which of the possible keys exist in the JSON and store it in a new column.
Below is the data from a column 'class', which is one long string each in BQ. The first level key name can equal ‘corp’, ’sme’, or ‘person’ (see examples below).
Example 1
{
"corp": {
"address": {
"city": "London",
"countryCode": "gb",
"streetAddress": [
"Fairlop road"
],
"zip": "e111bn"
},
"cin": 1234567420,
"title": "Demo Corp"
}
}
Example 2
{
"person": {
"address": {
"city": "Madrid",
"countryCode": "es",
"streetAddress": [
"Some street 1"
],
"zip": "z1123ab"
},
"cin": 1234567411,
"title": "Demo Person"
}
}
I've tried using the json_xxx functions, but they require specifying the json_path. I'm interested in fetching the json_path name to create a new column (cust_type)which lists corp, sme, person for each row.
example | cust_type |
---|---|
1 | corp |
2 | person |
This is my first question so pls bear with me! Thnx
CodePudding user response:
Maybe we can use the JSON_EXTRACT
function and look to see if the field exists (is not null). An example test might be:
SELECT CASE
WHEN JSON_EXTRACT(json_text, '$.corp') is not null then 'corp'
WHEN JSON_EXTRACT(json_text, '$.person') is not null then 'person'
WHEN JSON_EXTRACT(json_text, '$.sme') is not null then 'sme'
END AS cust_type
FROM UNNEST([
'{"corp": {"address": {"city": "London","countryCode": "gb","streetAddress": ["Fairlop road"],"zip": "e111bn"},"cin": 1234567420,"title": "Demo Corp"}}',
'{"person": {"address": {"city": "Madrid","countryCode": "es","streetAddress": ["Some street 1"],"zip": "z1123ab"},"cin": 1234567411,"title": "Demo Person"}}'
]) AS json_text;
CodePudding user response:
Also you can use a function to extract first level keys whatever they are.
CREATE TEMP FUNCTION json_keys(input STRING) RETURNS ARRAY<STRING> LANGUAGE js AS """
return Object.keys(JSON.parse(input))
""";
SELECT json_keys(json_text) AS cust_type
FROM UNNEST([
'{"corp": {"address": {"city": "London","countryCode": "gb","streetAddress": ["Fairlop road"],"zip": "e111bn"},"cin": 1234567420,"title": "Demo Corp"}}',
'{"person": {"address": {"city": "Madrid","countryCode": "es","streetAddress": ["Some street 1"],"zip": "z1123ab"},"cin": 1234567411,"title": "Demo Person"}}'
]) AS json_text;
output: