Home > Software design >  Snowflake SQL - Count Distinct Users within descending time interval
Snowflake SQL - Count Distinct Users within descending time interval

Time:06-15

I want to count the distinct amount of users over the last 60 days, and then, count the distinct amount of users over the last 59 days, and so on and so forth.

Ideally, the output would look like this (TARGET OUTPUT)

Day Distinct Users
60  200
59  200
58  188
57  185
56  180
[...]   [...]

where 60 days is the max total possible distinct users, and then 59 would have a little less and so on and so forth.

my query looks like this.

select
        count(distinct (case when datediff(day,DATE,current_date) <= 60 then USER_ID end)) as day_60,
        count(distinct (case when datediff(day,DATE,current_date) <= 59 then USER_ID end)) as day_59,
        count(distinct (case when datediff(day,DATE,current_date) <= 58 then USER_ID end)) as day_58

FROM Table

The issue with my query is that This outputs the data by column instead of by rows (like shown below) AND, most importantly, I have to write out this logic 60x for each of the 60 days.

Current Output:

Day_60  Day_59  Day_58
209     207     207

Is it possible to write the SQL in a way that creates the target as shown initially above?

CodePudding user response:

You can do unpivot after getting your current output. sample one. select * from ( select 209 Day_60, 207 Day_59, 207 Day_58 )unpivot ( cnt for days in (Day_60,Day_59,Day_58));

CodePudding user response:

Using below data in CTE format -

with data_cte(dates,userid) as
(select * from values
('2022-05-01'::date,'UID1'),
('2022-05-01'::date,'UID2'),
('2022-05-02'::date,'UID1'),
('2022-05-02'::date,'UID2'),
('2022-05-03'::date,'UID1'),
('2022-05-03'::date,'UID2'),
('2022-05-03'::date,'UID3'),
('2022-05-04'::date,'UID1'),
('2022-05-04'::date,'UID1'),
('2022-05-04'::date,'UID2'),
('2022-05-04'::date,'UID3'),
('2022-05-04'::date,'UID4'),
('2022-05-05'::date,'UID1'),
('2022-05-06'::date,'UID1'),
('2022-05-07'::date,'UID1'),
('2022-05-07'::date,'UID2'),
('2022-05-08'::date,'UID1')
)

Query to get all dates and count and distinct counts -

select dates,count(userid) cnt, count(distinct userid) cnt_d
from data_cte
group by dates;
DATES CNT CNT_D
2022-05-01 2 2
2022-05-02 2 2
2022-05-03 3 3
2022-05-04 5 4
2022-05-05 1 1
2022-05-06 1 1
2022-05-08 1 1
2022-05-07 2 2

Query to get difference of date from current date

select dates,datediff(day,dates,current_date()) ddiff,
count(userid) cnt,
count(distinct userid) cnt_d
from data_cte
group by dates;
DATES DDIFF CNT CNT_D
2022-05-01 45 2 2
2022-05-02 44 2 2
2022-05-03 43 3 3
2022-05-04 42 5 4
2022-05-05 41 1 1
2022-05-06 40 1 1
2022-05-08 38 1 1
2022-05-07 39 2 2

Get records with date difference beyond a certain range only - include clause having

select datediff(day,dates,current_date()) ddiff,
count(userid) cnt,
count(distinct userid) cnt_d
from data_cte
group by dates
having ddiff<=43;
DDIFF CNT CNT_D
43 3 3
42 5 4
41 1 1
39 2 2
38 1 1
40 1 1

If you need to prefix 'day' to each date diff count, you can add and outer query to previously fetched data-set and add the needed prefix to the date diff column as following -

I am using CTE syntax, but you may use sub-query given you will select from table -

,cte_1 as (
select datediff(day,dates,current_date()) ddiff,
count(userid) cnt,
count(distinct userid) cnt_d
from data_cte
group by dates
having ddiff<=43)
select 'day_'||to_char(ddiff) days,
cnt,
cnt_d
from cte_1;
DAYS CNT CNT_D
day_43 3 3
day_42 5 4
day_41 1 1
day_39 2 2
day_38 1 1
day_40 1 1
  • Related