Home > Software engineering >  extracting name/value pairs into columns and values - Shopify Data In Bigquery
extracting name/value pairs into columns and values - Shopify Data In Bigquery

Time:11-24

We use FiveTran to extract our data from shopify and store it in BigQuery. The field "properties" within the order_line table contains what looks like an array of key/value pairs. In this case name/value. The field type is string here is an example of the contents

order_line_id   properties
9956058529877   [{"name":"_order_bump_rule_id","value":"4afx7cbw6"},{"name":"_order_bump_bump_id","value":"769d1996-b6fb-4bc3-8d41-c4d7125768c5"},{"name":"_source","value":"order-bump"}]
4467731660885   [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_delay","value":null},{"name":"charge_on_day_of_week","value":null},{"name":"charge_interval_frequency","value":null},{"name":"charge_on_day_of_month","value":null},{"name":"shipping_interval_frequency","value":null},{"name":"number_charges_until_expiration","value":null}]
4467738738773   [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_delay","value":null},{"name":"charge_on_day_of_week","value":null},{"name":"charge_interval_frequency","value":null},{"name":"charge_on_day_of_month","value":null},{"name":"shipping_interval_frequency","value":null},{"name":"number_charges_until_expiration","value":null}]
4578798600277   [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_interval_frequency","value":null},{"name":"shipping_interval_frequency","value":null}]

I am trying to write a query that generate one row per record with a column for each of these name values:

  • shipping_interval_unit_type
  • charge_on_day_of_week
  • charge_interval_frequency
  • charge_on_day_of_month
  • subscription_id
  • number_charges_until_expiration
  • shipping_interval_frequency

and the corresponding "value". This field "properties" can contain many different "name" values and they can be in different order each time. The "name" values noted above are not always present in the "properties" field.

I've tried json functions but it doesn't seem to be properly formatted for json. I've tried unnesting it but that fails since it is a string.

CodePudding user response:

Consider below approach

select * from (
  select order_line_id, 
    json_extract_scalar(property, '$.name') name,
    json_extract_scalar(property, '$.value') value
  from your_table, unnest(json_extract_array(properties)) property
)
pivot (min(value) for name in (
  'shipping_interval_unit_type',
  'charge_on_day_of_week',
  'charge_interval_frequency',
  'charge_on_day_of_month',
  'subscription_id',
  'number_charges_until_expiration',
  'shipping_interval_frequency'
))
  • Related