I tried to calculate the quarterly growth rate in sales for different stores. However, I group by my data several times until it became the following status:
How can I generate the growth rate table based on this equation: (Q3-Q2)/Q2?
The code I programmed so far is as follows. Thank you.
Store | Quarter | Weekly_Sales |
---|---|---|
1 | Q2 | 60428109 |
1 | Q3 | 20253948 |
2 | Q2 | 74356864 |
2 | Q3 | 24303355 |
3 | Q2 | 15459190 |
3 | Q3 | 5298005 |
4 | Q2 | 79302989 |
4 | Q3 | 27796792 |
5 | Q2 | 12523263 |
5 | Q3 | 4163791 |
library("dplyr")
library("lubridate")
Walmart_data_set <- read.csv("Walmart_Store_sales.csv")
Walmart_data_set$Date <- as.Date(Walmart_data_set$Date, "%d-%m-%Y")
Walmart_data_set["Month"] <- month(Walmart_data_set$Date)
Walmart_data_set["Quarter"] <- quarters(Walmart_data_set$Date)
Walmart_data_set["Year"] <- format(Walmart_data_set$Date, format ="%Y")
Q23_2012_Sales<- filter(Walmart_data_set, Year == "2012" & Quarter == "Q3" | Quarter == "Q2")
Sales_Store_quarter = Q23_2012_Sales %>% group_by(Store, Quarter) %>%
summarise(Weekly_Sales = sum(Weekly_Sales),
.groups = 'drop')
CodePudding user response:
You can do it like this:
df %>%
arrange(Store, Quarter) %>%
group_by(Store) %>%
mutate(growth = (Weekly_Sales - lag(Weekly_Sales))/lag(Weekly_Sales))
Output:
Store Quarter Weekly_Sales growth
<dbl> <chr> <dbl> <dbl>
1 1 Q2 60428109 NA
2 1 Q3 20253948 -0.665
3 2 Q2 74356864 NA
4 2 Q3 24303355 -0.673
5 3 Q2 15459190 NA
6 3 Q3 5298005 -0.657
7 4 Q2 79302989 NA
8 4 Q3 27796792 -0.649
9 5 Q2 12523263 NA
10 5 Q3 4163791 -0.668
CodePudding user response:
Don't group by Quarter
.
library(dplyr)
dat %>%
arrange(Store, Quarter) %>%
group_by(Store) %>%
mutate(Growth = c(NA, diff(Weekly_Sales)) / dplyr::lag(Weekly_Sales)) %>%
ungroup()
# . >
# # A tibble: 10 x 4
# Store Quarter Weekly_Sales Growth
# <int> <chr> <int> <dbl>
# 1 1 Q2 60428109 NA
# 2 1 Q3 20253948 -0.665
# 3 2 Q2 74356864 NA
# 4 2 Q3 24303355 -0.673
# 5 3 Q2 15459190 NA
# 6 3 Q3 5298005 -0.657
# 7 4 Q2 79302989 NA
# 8 4 Q3 27796792 -0.649
# 9 5 Q2 12523263 NA
# 10 5 Q3 4163791 -0.668
This method assumes that you always have a Q2
for each Q3
. (The converse would be you have more history in your data, with some stores perhaps gapping a quarter or two.)
Data
dat <- structure(list(Store = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L), Quarter = c("Q2", "Q3", "Q2", "Q3", "Q2", "Q3", "Q2", "Q3", "Q2", "Q3"), Weekly_Sales = c(60428109L, 20253948L, 74356864L, 24303355L, 15459190L, 5298005L, 79302989L, 27796792L, 12523263L, 4163791L)), class = "data.frame", row.names = c(NA, -10L))