I am looking to compare monthly sales data in one month to the following 3 months. For reference, i want to compare the sales rates in January to Feb, March and April sales and see the % growth or decline. I then want to do this for May over June/july/aug and then sept over OCT/NOV/DEC.
Example data i have is below. Not all SKUs sell year round (We dont sell Halloween branded items in March etc) so there is not 12 months of data for all SKUs.
SKU | Sales | Month_number | Month
561 2 1 Jan
561 2 2 Feb
561 6 3 MAR
561 6 4 APR
191 5 8 AUG
191 10 9 SEP
191 10 10 OCT
191 12 11 NOV
.. ETC
Right now im using a lag statement to calculate month over month changes but cant think of a creative solution for calculating Jan vs March for a large dataset.
My current solution is to do the following
Data want;
Set have;
if month = 'Jan' then JanSales = Sales;
if month = 'MAR' then MarSales = sales;
MarJan = Marchsales/Jansales;
format Marjan percent8.2;
run;
This would create a new column called MarJan that is = 300%. in the case above. Now i could easily do that for every month but is there a cleaner, more efficient way to do this?
Thanks!
CodePudding user response:
We can do this by using a counter. On the first month, we'll store the sales value of the month. We'll then compare each subsequent month after that. For example, let's call n
our counter and compare_month_sales
the sales from the month we want to compare.
sales month n compare_month_sales
2 Jan 1 2
2 Feb 2 2
6 MAR 3 2
6 APR 4 2
5 AUG 1 5
10 SEP 2 5
...
When it's between 2 and 4, we have months that we want to compare to. Let's put this logic into action using by-groups. It looks like your data is sorted in a particular order, so we'll use the notsorted
option.
data want;
set have;
retain compare_month_sales; /* Do not reset this variable at each row */
by sku month_number notsorted;
n 1;
/* Create a counter for the number of months that have passed.
If we've incremented to the 5th month, reset the counter */
if(n = 5 OR first.sku) then n = 1;
/* On the first month, get the comparison sale value */
if(n = 1) then compare_month_sales = sales;
/* Between The 2nd, 3rd, and 4th months will compare to the first */
if(1 < n < 5) then pct = sales/compare_month_sales;
format pct percent8.;
drop n compare_month_sales;
run;
Output:
sku sales month_number month pct
561 2 1 Jan .
561 2 2 Feb 100%
561 6 3 MAR 300%
561 6 4 APR 300%
191 5 8 AUG .
191 10 9 SEP 200%
191 10 10 OCT 200%
191 12 11 NOV 240%