We use FiveTran to extract our data from shopify and store it in BigQuery. The field "properties" within the order_line table contains what looks like an array of key/value pairs. In this case name/value. The field type is string here is an example of the contents
order_line_id properties
9956058529877 [{"name":"_order_bump_rule_id","value":"4afx7cbw6"},{"name":"_order_bump_bump_id","value":"769d1996-b6fb-4bc3-8d41-c4d7125768c5"},{"name":"_source","value":"order-bump"}]
4467731660885 [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_delay","value":null},{"name":"charge_on_day_of_week","value":null},{"name":"charge_interval_frequency","value":null},{"name":"charge_on_day_of_month","value":null},{"name":"shipping_interval_frequency","value":null},{"name":"number_charges_until_expiration","value":null}]
4467738738773 [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_delay","value":null},{"name":"charge_on_day_of_week","value":null},{"name":"charge_interval_frequency","value":null},{"name":"charge_on_day_of_month","value":null},{"name":"shipping_interval_frequency","value":null},{"name":"number_charges_until_expiration","value":null}]
4578798600277 [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_interval_frequency","value":null},{"name":"shipping_interval_frequency","value":null}]
I am trying to write a query that generate one row per record with a column for each of these name values:
- shipping_interval_unit_type
- charge_on_day_of_week
- charge_interval_frequency
- charge_on_day_of_month
- subscription_id
- number_charges_until_expiration
- shipping_interval_frequency
and the corresponding "value". This field "properties" can contain many different "name" values and they can be in different order each time. The "name" values noted above are not always present in the "properties" field.
I've tried json functions but it doesn't seem to be properly formatted for json. I've tried unnesting it but that fails since it is a string.
CodePudding user response:
Consider below approach
select * from (
select order_line_id,
json_extract_scalar(property, '$.name') name,
json_extract_scalar(property, '$.value') value
from your_table, unnest(json_extract_array(properties)) property
)
pivot (min(value) for name in (
'shipping_interval_unit_type',
'charge_on_day_of_week',
'charge_interval_frequency',
'charge_on_day_of_month',
'subscription_id',
'number_charges_until_expiration',
'shipping_interval_frequency'
))