Home > Software engineering >  Rolling Quarter Calculation by Month While Using Another Column to group by
Rolling Quarter Calculation by Month While Using Another Column to group by

Time:05-20

I am trying to calculate rolling percentages for quarterly time periods. Sample data via sql is below:

create table #demo (Month char(7), Type varchar(10), Mileage int);
--low
insert into #demo values ('2021-05', 'low', 5000);
insert into #demo values ('2021-06', 'low', 15000);
insert into #demo values ('2021-07', 'low', 3000);
insert into #demo values ('2021-08', 'low', 3500);

-- med
insert into #demo values ('2021-05', 'med', 6789);
insert into #demo values ('2021-06', 'med', 12876);
insert into #demo values ('2021-07', 'med', 1578);
insert into #demo values ('2021-08', 'med', 3500);


--high
insert into #demo values ('2021-05', 'high', 5000);
insert into #demo values ('2021-06', 'high', 1500);
insert into #demo values ('2021-07', 'high', 2700);
insert into #demo values ('2021-08', 'high', 2968);
Month   Type    Mileage
2021-05 high    5000
2021-05 low     5000
2021-05 med     6789
2021-06 high    1500
2021-06 low     15000
2021-06 med     12876
2021-07 high    2700
2021-07 low     3000
2021-07 med     1578
2021-08 high    2968
2021-08 low     3500
2021-08 med     3500

My goal is to take this data and calculate the rolling 3 month percentage for each category individually.

Example: for the first 3 months of 2021 I would desire to get the sum(low mileage), sum(med mileage), sum (high mileage) for 2021-05, 2021-06, 2021-07 and then divide each sum by the total of those three months (for all categories). I'm imagining a PIVOT solution would be ideal here. I'm just not sure how to go about it.

So I would expect to see rolling totals like:

Month    ThreeMonth_TotalMileage ThreeMonth_Low  ThreeMonth_Med ThreeMonth_High
2021-05  NULL                    NULL            NULL           NULL
2021-06  NULL                    NULL            NULL           NULL
2021-07  53443                   23000           21243          9200
2021-08  46622                   18800           17954          7168

E.g. ThreeMonth_TotalMileage is the sum of all mileage for 2021-05, 2021-06, 2021-07. This will only show as a running total in 2021-07 since we have 3 months of history we can calculate a sum from, otherwise I'd like it to be null.

3Month_Low is the sum of all low mileage for 2021-05, 2021-06, 2021-07, basically its inclusive of the current month plus the prior 2 for each calc.

Any ideas on how I could try to accomplish this using SQL Server? Thank you!

CodePudding user response:

You can achieve that using Ordered Analytical Function. The idea is to find cumulative sum.

For Pivoting the data from row to column, you can use CASE and MAX.

select
Month
,Max(case when Type='low' then Mileage end)  as Month_Low
,Max(case when Type='med' then Mileage end)as Month_Med
,Max(case when Type='high' then Mileage end)  as Month_High
,case when ROW_NUMBER() over(order by Month asc)>=3 Then 1 end as is_3month_data_present
from #demo
group by month
Month Month_Low Month_Med Month_High is_3month_data_present
2021-05 5000 6789 5000 null
2021-06 15000 12876 1500 null
2021-07 3000 1578 2700 1
2021-08 3500 3500 2968 1

Below SQL should give the desired final result.

select
Month
,case when is_3month_data_present=1 then Sum(Month_Low Month_Med Month_High) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW )  end as ThreeMonth_TotalMileage   
,case when is_3month_data_present=1 then Sum(Month_Low) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW )  end as ThreeMonth_Low  
,case when is_3month_data_present=1 then Sum(Month_Med ) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end as ThreeMonth_Med 
,case when is_3month_data_present=1 then Sum(Month_High) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end  as ThreeMonth_High
from
(select
Month
,Max(case when Type='low' then Mileage end)  as Month_Low
,Max(case when Type='med' then Mileage end)as Month_Med
,Max(case when Type='high' then Mileage end)  as Month_High
,case when ROW_NUMBER() over(order by Month asc)>=3 Then 1 end as is_3month_data_present
from #demo
group by Month
)stg

Output:

Month ThreeMonth_TotalMileage ThreeMonth_Low ThreeMonth_Med ThreeMonth_High
2021-05 null null null
2021-06 null null null
2021-07 53443 23000 21243 9200
2021-08 46622 21500 17954 7168

Note: Expected output provided in the question is wrong as per the dataset given. ThreeMonth_low is wrong, rest all are correct. For 2021-07, actual value should be 23000 instead of 20300.

DB Fiddle: Try it here

  • Related