Home > Net >  Count total customer_id's not partitioned by column
Count total customer_id's not partitioned by column

Time:03-26

I would like to calculate the total number of customers without adding an additional subquery. The count should be partitioned by country but rather by the month_ column.

EDIT: I updated the query to use GROUPING SETS

Current query:

select date_trunc('month',date_) as month_,
country,
count(distinct customer_id) as total_customers 
GROUP BY GROUPING SETS (
   (date_trunc('month',date_), country),
   (date_trunc('month',date_))
from table_a 

Current output

month_        country  total_customers_per_country   
2020-01-01    US       320                          
2020-01-01    GB       360                           
2020-01-01             680
2020-02-01    US       345                          
2020-02-01    GB       387
2020-02-01             732                          

Desired output:

month_        country  total_customers_per_country   total_customers
2020-01-01    US       320                           680
2020-01-01    GB       360                           680
2020-02-01    US       345                           732
2020-02-01    GB       387                           732

CodePudding user response:

This may depend on the version of sql server you are using but you are likely looking for "window" functions.

I believe something along the lines of the following will give you the result you are looking for:

select date_trunc('month',date_) as month_,
country,
count(distinct customer_id) as total_customers_by_country,
count(distinct customer_id) OVER (partition by date_trunc('month',date_)) as total_customers
from table_a

https://docs.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-ver15

CodePudding user response:

You can perform subquery to group by month-country pair and then use sum over window partitioned by month:

-- sample data
WITH dataset (id, date, country) AS (
    VALUES (1, date '2020-01-01', 'US'),
        (2, date '2020-01-01', 'US'),
        (1, date '2020-01-01', 'GB'),
        (3, date '2020-01-02', 'US'),
        (1, date '2020-01-02', 'GB'),
        (1, date '2020-02-01', 'US')
) 

--query
select *,
    sum(total_customers_per_country) over (partition by month) total_customers
from (
        select date_trunc('month', date) as month,
            country,
            count(distinct id) total_customers_per_country   
        from dataset
        group by 1, country
    )
order by month, country desc

Output:

month country total_customers_per_country total_customers
2020-01-01 US 3 4
2020-01-01 GB 1 4
2020-02-01 US 1 1
  • Related