Home > Software engineering >  extracting data from unstructured JSON in big query
extracting data from unstructured JSON in big query

Time:07-13

I have JSON as a string in a big query field:

[{"name":"user_group","value":"regular"},{"name":"checkout_version","value":"2.2"},{"name":"currency","value":"EUR"},{"name":"currency_exchange_rate","value":"1"},{"name":"currency_symbol","value":"€"},{"name":"variant","value":"default"},{"name":"snowplow_id","value":"XXXXXXX"},{"name":"ip_address","value":"XXXX"},{"name":"user_agent","value":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36"},{"name":"is_test_order","value":"false"}]

I'm going crazy trying to extract the value ("default") from this section:

{"name":"variant","value":"default"}

the part I want will always follow "name":"variant","value":" and have a " at the end.

I have tried json_extract but regexp_extract seems the best option, I tried this: (

select REGEXP_EXTRACT(json_string_field, r'/\{"value":"([^"] )"/') as variant
from source_table

)

but I'm just getting nulls back...would appreciate ideas...

CodePudding user response:

consider below query

WITH json_data AS (
  SELECT '[{"name":"user_group","value":"regular"},{"name":"checkout_version","value":"2.2"},{"name":"currency","value":"EUR"},{"name":"currency_exchange_rate","value":"1"},{"name":"currency_symbol","value":"€"},{"name":"variant","value":"default"},{"name":"snowplow_id","value":"XXXXXXX"},{"name":"ip_address","value":"XXXX"},{"name":"user_agent","value":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36"},{"name":"is_test_order","value":"false"}]' json
)
SELECT JSON_VALUE(kv, '$.value') AS value
  FROM json_data, UNNEST(JSON_QUERY_ARRAY(json)) kv
 WHERE JSON_VALUE(kv, '$.name') = 'variant';

enter image description here

  • Related