Home > Software design >  Counting total IDs in a table for changing dates using Snowflake
Counting total IDs in a table for changing dates using Snowflake

Time:07-21

So let's say I have some data as follows. It is a table where the distinct identifier for each row is the ID the MONTH_END.

TABLE1

ID  MONTH_END        data
001 2022-03-31      grape
001 2022-04-30     orange
001 2022-05-31  pineapple
002 2021-01-31      berry
002 2021-02-28     banana
003 2019-02-28     orange   
003 2021-12-31      grape     
003 2022-01-31     banana      
003 2019-02-28     orange    

My goal here is I want to make a new variable called frequency where I take a new table and determine how many distinct times the ID appears before and during the MONTH_END in table1.

TABLE2

ID   freq_date  freq_val
001 2022-03-31         4
002 2022-02-20       242
003 2021-12-31       122
004 2020-07-02       342
001 2021-08-23        44
001 2021-01-13      7347
003 2021-01-21       212

Additionally, I would like the sum of the freq_vals in the same fashion called freq_val2.

Final expected output:

ID  MONTH_END        data  frequency freq_val2
001 2022-03-31      grape          3      7395  
001 2022-04-30     orange          3      7395
001 2022-05-31  pineapple          3      7395
002 2021-01-31      berry          0         0
002 2021-02-28     banana          1       242
003 2019-02-28     orange          0         0
003 2021-12-31      grape          1       122
003 2022-01-31     banana          2       334
003 2019-02-28     orange          2       334

CodePudding user response:

I think the logic is correct in the query, but I'm unable to obtain the same results as shown above - please read my comment regarding some questions about your test data (duplicate rows, and values that exceed the logic threshold described).

   select
        t1.id ,
        t1.month_end,
        t1.data,
        count(t2.id) as frequency,
        nvl(sum(t2.freq_val),0) as freq_val2
    from
        table_1 t1
        left join  table_2 t2 on t1.id = t2.id
        and t2.freq_date <= t1.month_end
        group by 1,2,3 order by 1,2;

Results:

ID  MONTH_END   DATA        FREQUENCY   FREQ_VAL2
001 2022-03-31  grape       3           7395
001 2022-04-30  orange      3           7395
001 2022-05-31  pineapple   3           7395
002 2021-01-31  berry       0           0
002 2021-02-28  banana      0           0
003 2019-02-28  orange      0           0
003 2021-12-31  grape       2           334
003 2022-01-31  banana      2           334
  

CodePudding user response:

WITH CTE AS (
 select '001' ID,  '2022-03-31'::DATE MONTH_END,      'grape' DATA UNION ALL
 select '001' ID,  '2022-04-30'::DATE MONTH_END,     'orange' DATA UNION ALL
 select '001' ID,  '2022-05-31'::DATE MONTH_END,  'pineapple' DATA UNION ALL
 select '002' ID,  '2021-01-31'::DATE MONTH_END,      'berry' DATA UNION ALL
 select '002' ID,  '2021-02-28'::DATE MONTH_END,     'banana' DATA UNION ALL
 select '003' ID,  '2019-02-28'::DATE MONTH_END,     'orange' DATA UNION ALL   
 select '003' ID,  '2021-12-31'::DATE MONTH_END,      'grape' DATA UNION ALL     
 select '003' ID,  '2022-01-31'::DATE MONTH_END,     'banana' DATA UNION ALL      
 select '003' ID,  '2019-02-28'::DATE MONTH_END,     'orange' DATA )

 SELECT ID, MONTH_END, COUNT(1) FREQ, COUNT(1) OVER (PARTITION BY 1) 
 FROM CTE 
 GROUP BY
      ID, MONTH_END; 

enter image description here

  • Related