I am trying to calculate rolling percentages for quarterly time periods. Sample data via sql is below:
create table #demo (Month char(7), Type varchar(10), Mileage int);
--low
insert into #demo values ('2021-05', 'low', 5000);
insert into #demo values ('2021-06', 'low', 15000);
insert into #demo values ('2021-07', 'low', 3000);
insert into #demo values ('2021-08', 'low', 3500);
-- med
insert into #demo values ('2021-05', 'med', 6789);
insert into #demo values ('2021-06', 'med', 12876);
insert into #demo values ('2021-07', 'med', 1578);
insert into #demo values ('2021-08', 'med', 3500);
--high
insert into #demo values ('2021-05', 'high', 5000);
insert into #demo values ('2021-06', 'high', 1500);
insert into #demo values ('2021-07', 'high', 2700);
insert into #demo values ('2021-08', 'high', 2968);
Month Type Mileage
2021-05 high 5000
2021-05 low 5000
2021-05 med 6789
2021-06 high 1500
2021-06 low 15000
2021-06 med 12876
2021-07 high 2700
2021-07 low 3000
2021-07 med 1578
2021-08 high 2968
2021-08 low 3500
2021-08 med 3500
My goal is to take this data and calculate the rolling 3 month percentage for each category individually.
Example: for the first 3 months of 2021 I would desire to get the sum(low mileage), sum(med mileage), sum (high mileage) for 2021-05, 2021-06, 2021-07 and then divide each sum by the total of those three months (for all categories). I'm imagining a PIVOT solution would be ideal here. I'm just not sure how to go about it.
So I would expect to see rolling totals like:
Month ThreeMonth_TotalMileage ThreeMonth_Low ThreeMonth_Med ThreeMonth_High
2021-05 NULL NULL NULL NULL
2021-06 NULL NULL NULL NULL
2021-07 53443 23000 21243 9200
2021-08 46622 18800 17954 7168
E.g. ThreeMonth_TotalMileage is the sum of all mileage for 2021-05, 2021-06, 2021-07. This will only show as a running total in 2021-07 since we have 3 months of history we can calculate a sum from, otherwise I'd like it to be null.
3Month_Low is the sum of all low mileage for 2021-05, 2021-06, 2021-07, basically its inclusive of the current month plus the prior 2 for each calc.
Any ideas on how I could try to accomplish this using SQL Server? Thank you!
CodePudding user response:
You can achieve that using Ordered Analytical Function. The idea is to find cumulative sum.
For Pivoting the data from row to column, you can use CASE
and MAX
.
select
Month
,Max(case when Type='low' then Mileage end) as Month_Low
,Max(case when Type='med' then Mileage end)as Month_Med
,Max(case when Type='high' then Mileage end) as Month_High
,case when ROW_NUMBER() over(order by Month asc)>=3 Then 1 end as is_3month_data_present
from #demo
group by month
Month | Month_Low | Month_Med | Month_High | is_3month_data_present |
---|---|---|---|---|
2021-05 | 5000 | 6789 | 5000 | null |
2021-06 | 15000 | 12876 | 1500 | null |
2021-07 | 3000 | 1578 | 2700 | 1 |
2021-08 | 3500 | 3500 | 2968 | 1 |
Below SQL should give the desired final result.
select
Month
,case when is_3month_data_present=1 then Sum(Month_Low Month_Med Month_High) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end as ThreeMonth_TotalMileage
,case when is_3month_data_present=1 then Sum(Month_Low) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end as ThreeMonth_Low
,case when is_3month_data_present=1 then Sum(Month_Med ) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end as ThreeMonth_Med
,case when is_3month_data_present=1 then Sum(Month_High) over(order by Month ROWS BETWEEN 2 PRECEDING and CURRENT ROW ) end as ThreeMonth_High
from
(select
Month
,Max(case when Type='low' then Mileage end) as Month_Low
,Max(case when Type='med' then Mileage end)as Month_Med
,Max(case when Type='high' then Mileage end) as Month_High
,case when ROW_NUMBER() over(order by Month asc)>=3 Then 1 end as is_3month_data_present
from #demo
group by Month
)stg
Output:
Month | ThreeMonth_TotalMileage | ThreeMonth_Low | ThreeMonth_Med | ThreeMonth_High |
---|---|---|---|---|
2021-05 | null | null | null | |
2021-06 | null | null | null | |
2021-07 | 53443 | 23000 | 21243 | 9200 |
2021-08 | 46622 | 21500 | 17954 | 7168 |
Note: Expected output provided in the question is wrong as per the dataset given. ThreeMonth_low is wrong, rest all are correct. For 2021-07, actual value should be 23000
instead of 20300
.
DB Fiddle: Try it here