Home > Software engineering >  Create a two-weeks window frame
Create a two-weeks window frame

Time:10-07

I have a dataset that's just a list of orders made by customers each day.

order_date month week customer
2022-10-06 10 40 Paul
2022-10-06 10 40 Edward
2022-10-01 10 39 Erick
2022-09-26 9 39 Divine
2022-09-23 9 38 Alice
2022-09-21 9 38 Evelyn

My goal is to calculate the total number of unique customers within a two-week period. I can count the number of customers within a month or week period but not two weeks. Also, the two weeks are in a rolling order such that weeks 40 and 39 (as in the sample above) is one window period while weeks 39 and 38 is the next frame.

So far, this is how I am getting the monthly and weekly numbers. Assume that the customer names are distinct per day.

select order_date,
       month,
       week,
       COUNT(DISTINCT customer) over (partition by month) month_active_outlets,
       COUNT(DISTINCT customer) OVER (partition by week) week active outlets,
from table

Again, I am unable to calculate the unique customer names within a two-week period.

CodePudding user response:

I think the easiest would be to create your own grouper in a subquery and then use that to get to your count. Currently, COUNT UNIQUE and ORDER BY in the window is not supported, therefore that approach wouldn't work.

A possible query could be:

WITH
  week_before AS (
  SELECT
    EXTRACT(WEEK from order_date) as week, --to be sure this is the same week format
    month,
    CONCAT(week,'-', EXTRACT(WEEK FROM DATE_SUB(order_date, INTERVAL 7 DAY))) AS two_weeks,
    customer
  FROM
    `test`.`Basic`)
SELECT
  two_weeks,
  COUNT(DISTINCT customer) AS unique_customer
FROM
  week_before
GROUP BY
  two_weeks

CodePudding user response:

The window function is the right tool. To obtain the 2 week date, we first extract the week number of the year:

mod(extract(week from order_date),2)

If the week number is odd (modulo 2) we add a week. Then we trunc to the start of (the even) week.

date_trunc(date_add(order_date,interval mod(extract(week from order_date),2) week),week )
with tbl as 
(Select date("2022-10-06") as order_date, "Paul" as customer 
union all select date("2022-10-06"),"Edward"
union all select date("2022-10-01"),"Erick"
union all select date("2022-09-26"),"Divine"
union all select date("2022-09-23"),"Alice"
union all select date("2022-09-21"),"Evelyn"
 )

select *, 
date_trunc(order_date,month) as month,
date_trunc(order_date,week) as week,
 COUNT(DISTINCT customer) OVER week2 as customer_2weeks,
 string_agg(cast(order_date as string)) over week2 as list_2weeks,
from tbl
window week2 as (partition by date_trunc(date_add(order_date,interval mod(extract(week from order_date),2) week),week ))

The first days of a year are counted to the last week of the previous year:

select order_date,
extract(isoweek from order_date),
date_trunc(date_add(order_date,interval mod(extract(week from order_date),2) week),week)
from
unnest(generate_date_array(date("2021-12-01"),date("2023-01-14"))) order_date
order by 1
  • Related