Home > Mobile >  Calculate distinct number of phones purchased under contract as of end of any given month from the b
Calculate distinct number of phones purchased under contract as of end of any given month from the b

Time:06-08

I have a table that has a list of subscribers who have a phone under contract as of the end of each month. I need to calculate how many unique contract phones the subscriber had from the beginning as of the end of each month in BigQuery.

Sample Dataset:

Month ID Phone Model Sub ID Contract Start Date
May 2022 X1 S1 2022-04-01
Apr 2022 X1 S1 2022-04-01
Mar 2022 X2 S1 2022-02-01
Feb 2022 X2 S1 2022-02-01
May 2022 X3 S2 2022-04-01
Apr 2022 X3 S2 2022-04-01
Mar 2022 X3 S2 2022-03-01
Feb 2022 X4 S2 2022-02-01

Desired output:

Month ID Sub ID count of distinct phones as of month end
May 2022 S1 2
Apr 2022 S1 2
Mar 2022 S1 1
Feb 2022 S1 1
May 2022 S2 2
Apr 2022 S2 2
Mar 2022 S2 2
Feb 2022 S2 1

Please not that in case of sub ID - S2, there are 3 contract start dates which ideally should mean 3 devices but there are only 2 devices. As a result the count for May & Apr 2022 for S2 should be 2 instead of 3

CodePudding user response:

Consider below approach

select 
  format_date('%B %Y', month) as month_id, sub_id, 
  count(distinct phone_model) as count_of_distinct_phones_as_of_month_end
from your_table,
unnest(generate_date_array(
  date_trunc(contract_start_date, month), 
  date_trunc(current_date(), month), 
  interval 1 month
)) month
group by month_id, sub_id
order by sub_id, parse_date('%B %Y', month_id) desc        

if applied to sample data in your question - output is

enter image description here

CodePudding user response:

You can try it with a window function, I guess.

DECLARE sample ARRAY<STRUCT<month_id STRING, phone_model STRING, sub_id STRING, contract_start_date DATE>>
DEFAULT [
    ('May 2022', 'X1', 'S1', DATE '2022-04-01'),
    ('Apr 2022', 'X1', 'S1', '2022-04-01'),
    ('Mar 2022', 'X2', 'S1', '2022-02-01'),
    ('Feb 2022', 'X2', 'S1', '2022-02-01'),
    ('May 2022', 'X3', 'S2', '2022-04-01'),
    ('Apr 2022', 'X3', 'S2', '2022-04-01'),
    ('Mar 2022', 'X3', 'S2', '2022-03-01'),
    ('Feb 2022', 'X4', 'S2', '2022-02-01')
];

SELECT month_id, sub_id,
       (SELECT COUNT(DISTINCT p) FROM UNNEST(phones) p) AS count_of_distinct_phones_as_of_month_end 
  FROM (
    SELECT *, ARRAY_AGG(phone_model) OVER (PARTITION BY sub_id ORDER BY PARSE_DATE('%B %Y', month_id)) AS phones
      FROM UNNEST(sample)
  );

output will be:

enter image description here

  • Related