Home > Software engineering >  Snowflake SQL : How do I insert and build in a array : all dates between a specific start and a spec
Snowflake SQL : How do I insert and build in a array : all dates between a specific start and a spec

Time:01-27

Given two dates, activation_date and termination_date per customer Imagine there is 20 days between the activation_date and the termination_date.

How do I get by customer an array which start with the activation_date and continue by day to the termination_date ?

I try using :

SELECT
  customer_id,
  first_day,
  last_day,
  ARRAY_AGG(TO_DATE(MY_DATE)) WITHIN GROUP (ORDER BY MY_DATE asc)
FROM (
  SELECT
    customer_id,
    activation_day as first_day,
    termination_date_clean_formatted as last_day,
    TO_DATE(dateadd(day, SEQ4(), first_day)) AS MY_DATE
  FROM v
  ,TABLE(GENERATOR(ROWCOUNT=>(20000)))
  WHERE MY_DATE <= last_day
)
group by subscription_id, first_day, last_day
)
select * 
from test

But unfortunatly it doesn't work at all, the results is completly random by customer_id, MY_DATE is not even starting at the same date as the first_day and i got only 8 results max per customer_id which is impossible. The result I'm excepting to have is :

customer_id array
546464654 [ "2022-01-02", "2022-01-03"....]
116541165 [ "2022-05-06", "2022-05-07"....]

Thanks for helping :)

CodePudding user response:

You can generate an array with all dates between dates with a JS UDF:

create or replace function gen_dates_array(SINCE date, UNTIL date)
returns variant
language javascript
as $$

dates = [];
currentDate = SINCE;
while (currentDate < UNTIL) {
    dates.push(new Date(currentDate));
    currentDate.setDate(currentDate.getDate()   1);
}
dates.push(UNTIL);

return dates.map(x => x.toISOString().split('T')[0]);
$$;

For example:

select gen_dates_array('2020-01-10'::date, '2020-01-13'::date)

-- [   "2020-01-10",   "2020-01-11",   "2020-01-12",   "2020-01-13" ]

CodePudding user response:

enter image description here

An alternative SQL only approach using ARRAY_AGG(),DATEDIFF(), FLATTEN(), REPEAT():

WITH SOURCE as (
SELECT   
    DATEADD(day, uniform(1, 365, random(12)), '2020-01-01')::DATE AS start_date
,   DATEADD(day, uniform(5,20, random(120)),  start_date)::DATE end_date
,   DATEDIFF(DAY,START_DATE,END_DATE) CUSTOMER_LENGTH_DAYS
,   uniform(1, 10, random(12))  customer_id 
 FROM TABLE(GENERATOR(rowcount => 10)) v ) 

SELECT   CUSTOMER_ID
        ,START_DATE
        ,END_DATE
        ,ARRAY_AGG( DATEADD(day,INDEX,START_DATE)::DATE) VOILA
FROM SOURCE
,LATERAL FLATTEN(INPUT=>STRTOK_TO_ARRAY(REPEAT('1~',CUSTOMER_LENGTH_DAYS),'~')) 
GROUP BY 1,2,3
  • Related