Home > Software design >  How to transform data into daily snapshot given the two date columns?
How to transform data into daily snapshot given the two date columns?

Time:01-18

I have product data in my table which looks similar to this

product_id user_id sales_start sales_end quantity
1 12 2022-01-01 2022-02-01 15
2 234 2022-11-01 2022-12-31 123

I want to transform the table into a daily snapshot so that it would look something like this:

product_id user_id quantity date
1 12 15 2022-01-01
1 12 15 2022-01-02
1 12 15 2022-01-03
... ... ... ...
2 234 123 2022-12-31

I know how to do a similar thing in Pandas, but I need to do it within AWS Athena. I thought of getting the date interval and unnest it, but I am struggling with mapping them properly.

Any ideas on how to transform data?

CodePudding user response:

This will help you sequence

SELECT product_id, user_id, quantity, date(date) as date FROM(    
    VALUES 
    (1, 12, DATE '2022-01-01', DATE '2022-02-01', 15),
    (2, 234, DATE '2022-11-01', DATE '2022-12-31', 123)
) AS t (product_id, user_id, sales_start, sales_end, quantity),
UNNEST(sequence(sales_start, sales_end, interval '1' day)) t(date)

CodePudding user response:

You can use sequnece to generate dates range and then unnest it:

-- sample data
with dataset(product_id, user_id, sales_start, sales_end, quantity) as (
    values (1,  12  , date '2022-01-01', date '2022-01-05', 15), -- short date ranges
        (2, 234, date '2022-11-01', date '2022-11-03', 123) -- short date ranges
)

-- query
select product_id, user_id, quantity, date
from dataset,
     unnest(sequence(sales_start, sales_end, interval '1' day)) as t(date);

Output:

product_id user_id quantity date
1 12 15 2022-01-01
1 12 15 2022-01-02
1 12 15 2022-01-03
1 12 15 2022-01-04
1 12 15 2022-01-05
2 234 123 2022-11-01
2 234 123 2022-11-02
2 234 123 2022-11-03
  • Related