Home > Software design >  Select a value of a column based on a max value of another aggregation
Select a value of a column based on a max value of another aggregation

Time:06-23

I have the following records:

enter image description here

And i need the results to show a single record of the account, the account_type of the max(original_valid_to) and the sum of the gross_revenue.

enter image description here

Is there an easy way i could do it in Bigquery?

CodePudding user response:

First that comes to mind is using a window function qualify:

select
  account,
  account_type,
  sum(gross_revenue) over (partition by account) as sum_gross_revenue
from your_table
where 1=1
qualify row_number() over (partition by account order by original_valid_to desc) = 1
  • sum(gross_revenue) over (partition by account) as sum_gross_revenue is summing all the fields by account, meaning that if your sample data had two different accounts, it would perform two different sums
  • where 1=1 is a redundant clause, but needed since to use the qualify clause in BigQuery you need to specify a where
  • qualify is performing something like a where but at the end of the query execution, meaning that we only get to show the last row based on a descending order of original_valid_to

CodePudding user response:

You can achieve this by using windowed functions. In the below example I have abbreviated some of the values:

with sample_data as (
  select '00000a' as account_id,  12345 as subscription_id, 'Free' as account_type, timestamp'2021-03-08T12:41:14.001Z' as original_valid_to, 0 as gross_revenue union all
  select '00000a' as account_id,  12346 as subscription_id, 'Free' as account_type, timestamp'2021-08-19T14:19:25.122Z' as original_valid_to, 45.985 as gross_revenue union all
  select '00000a' as account_id,  12347 as subscription_id, 'Premium' as account_type, timestamp'2023-06-14T08:39:10.535Z' as original_valid_to, 43.302 as gross_revenue 
)

select account_id,
  subscription_id,
  account_type,
  original_valid_to,
  sum(gross_revenue) over (partition by sample_data.account_id)
from sample_data
qualify row_number() OVER (partition by account_id order by original_valid_to desc) = 1

In the above example the window function definition in the sum allows it to not be calculated in context of the qualify statement. Additionally the qualify allows for the filtering down to just the latest row.

Qualify Clause

Window Functions

  • Related