I have a fairly simple string-formatted json column in a BigQuery database I am trying to flatten.
id | relationships |
---|---|
1 | {'ownerObject': {'data': None}, 'investmentObject': {'data': {'type': 'assets', 'id': '40'}}, 'securityObject': {'data': None}, 'segmentObject': {'data': None}, 'dataItemObject': {'data': {'type': 'dataItems', 'id': '13161'}}, 'scenarioObject': {'data': {'type': 'scenarios', 'id': '13'}}} |
Running:
SELECT
id,
JSON_EXTRACT_SCALAR(relationships, '$.investmentObject.data.type') as Invest_Type
FROM periodicData
Gives me:
id | Invest_Type |
---|---|
1 | null |
I would expect the column Invest_Type to be = asset. But no matter what I try on this json column, I always end up with null:
JSON_EXTRACT(relationships, '$.investmentObject.data.type') as Invest_Type
= nullJSON_EXTRACT(relationships, '$.dataItemObject.data.type') as Invest_Type
= nullJSON_EXTRACT(relationships, '$.dataItemObject') as Invest_Type
= nullJSON_EXTRACT(relationships, '$') as Invest_Type
= nullrelationships as r
= The full json (as expected)
I have been doing similar operations over the past week on a lot of tables with no issue, but somehow this table is stubborn. What could I be doing wrong?
How relationships looks in the schema
How relationships looks in the preview
Any help is appreciated!
CodePudding user response:
Your query works, but your json is not correctly formatted.
Change single quote to double quote and add quotes around "None".
a crude version:
with cleaned as (select id, replace(replace(relationships, "\'","\""),"None","\"None\"") as relationships from inputtable)
SELECT
id,
JSON_EXTRACT_SCALAR(relationships, '$.investmentObject.data.type') as Invest_Type
FROM cleaned;