Home > Mobile >  R's ceiling_date equivalent in SQL
R's ceiling_date equivalent in SQL

Time:05-07

I want to implement R's ceiling_date fucntion in SQL (Postgresql).

So I have dates in a column for everyday with corresponding sales and I want to accumulate the sales for a week over a single date (say Friday).

Input Format:

Input format
Dates in yellow are the dates to aggregate sales on

Expected output format:

Expected Output format

This can easily be done in R using ceiling_date but I want to do it in SQL itself.

Any help would be appreciated. Thanks

CodePudding user response:

Accepting and processing the ISO 8601 Standard is by far the easiest for processing date ranges. But this imposes a standard definition, which is essentially:

  1. All weeks consist on exactly 7 days.
  2. All weeks begin on Monday.
  3. The first week of the year is the week the contains 4-Jan.

The date_trunc function gives the first date of the week, adding 6 gives the last day of the week.

-- ISO 8601 Week definition 
select (date_trunc('week',dte)::date  6)  "Week Ending"
     , sum(sales)                         "Total Sales"
  from test    
  group by (date_trunc('week',dte)::date  6)
  order by (date_trunc('week',dte)::date  6);

Date/Week processing for non ISO 8601 presents somewhat tricky process to get the appropriate week definition. The following does so for week Friday - Thursday definition. It creates a date range for a year beginning with the first Friday in the table, then joins using the range contains operator to determine the appropriate summation period

with periods (wk) as 
     ( select daterange( ((min_dt   (n-1) * interval '1 week'))::date     
                       , ((min_dt   (n)   * interval '1 week'))::date 
                       , '(]'
                       ) 
        from (select min(dte) min_dt 
                from test
               where extract(dow from dte) = 5     --- Day_Of_Week (5) = Friday
             ) s
        cross join generate_series(0,52) gs(n) 
     ) --select * from periods;
select upper(wk)-1   "Week Ending" 
    ,  sum(sales)    "Total Sales"
  from periods
  join test 
    on (dte <@ wk)
group by upper(wk)-1 
order by upper(wk)-1;

See demo of both here. NOTE: Demo changes sample date from January (2022-01-01 ...) to May (2022-05-01 ...) as 6-January-2022 was Thursday not Friday as description, 6-May-2022 is however Friday. Also the sum of values ending 6-May is 38 (not 42 as indicated). Finally, neither query attempts a limiting date, but processed through end-of-data. Nor does either address multiple years of data.

CodePudding user response:

demo

idea: for 2022-Janurary-1 to 2022-Janurary-20, there is 3 Fridays:'2022-01-07','2022-01-14', '2022-01-21'. We need to partition by these 3 friday order by sales date. Now the problem is now to compute get all these date belong to these 3 fridays.

  • get every friday each sales_date belong to.
  • deal with special cases(one week after friday: saturday, sunday) when sales_date > friday then the real friday is next friday.

final code:

SELECT
    *,
    sum(amount) OVER (PARTITION BY sales.compute_friday ORDER BY sales_date)
FROM
    sales;

processing code:

BEGIN;
CREATE TABLE sales (
    sales_date date
    , amount numeric
);
INSERT INTO sales (sales_date , amount)
SELECT
    i
    , (random() * 10)::integer
FROM
    generate_series('2022-01-01'::timestamp , '2022-01-20'::timestamp , interval '1 day') g (i);
ALTER TABLE sales
    ADD COLUMN friday date;
UPDATE
    sales
SET
    friday = (date_trunc('week' , sales_date)   interval '4 day')::date;
ALTER TABLE sales
    ADD COLUMN compute_friday date;
UPDATE
    sales
SET
    compute_friday = CASE WHEN sales_date > friday THEN
        (friday   interval '7 days')::date
    ELSE
        friday
    END;
COMMIT;
  • Related