Home > database >  Growth Rates in R with dplyr not working, NAs generated - Help needed
Growth Rates in R with dplyr not working, NAs generated - Help needed

Time:09-30

I'm trying to generate growth rates for each fiscal half for various products in R using dplyr and the lag function.

Usually, this works for me. However this time it's generating NAs. I'm not sure what the issue is. The following code generates "NA" for all growth rates. Hoping Someone can help.

library(flexdashboard)
library(dplyr)
library(magrittr)
library(scales)
library(sqldf)
library(ggplot2)
library(lubridate)
library(knitr)
library(tidyr)
library(kableExtra)
library(ggrepel)
library(htmltools)
library(stringr)
library(readxl)

t <- c(3000,2000, 6000)
u <- c("FY18H1", "FY18H2", "FY19H1", "FY19H2", "FY20H1", "FY20H2")
x <- c(1,2,3,4,5)
y <- c("a","b","c","d","e")
z <- c("apples","oranges")
identifer <- sort(c(replicate(x,n =6)))
name <- sort(c(replicate(y,n=6)))
business <- sort(c(replicate(z,n=15)))
half <- c(replicate(u, n=5))
dollars <- c(replicate(t, n = 10))

df <- data.frame(identifer,name, business,half, dollars)

df <- df %>% group_by(
  identifer,
  name,
  business,
  half
) %>%
  mutate(
    YoY_GROWTH_DOLLARS = dollars - lag(dollars, 2),
    YoY_GROWTH_RATE = round(YoY_GROWTH_DOLLARS/lag(dollars,2),4)
  )

CodePudding user response:

Instead of using dplyr::mutate, use plyr::mutate

df %>% dplyr::group_by(
  identifer,
  name,
  business,
  half
) %>%
  plyr::mutate(
    YoY_GROWTH_DOLLARS = dollars - lag(dollars, 2),
    YoY_GROWTH_RATE = round(YoY_GROWTH_DOLLARS/lag(dollars,2),4)
  )

   identifer name  business half   dollars YoY_GROWTH_DOLLARS YoY_GROWTH_RATE
       <dbl> <chr> <chr>    <chr>    <dbl>              <dbl>           <dbl>
 1         1 a     apples   FY18H1    3000                 NA          NA    
 2         1 a     apples   FY18H2    2000                 NA          NA    
 3         1 a     apples   FY19H1    6000               3000           1    
 4         1 a     apples   FY19H2    3000               1000           0.5  
 5         1 a     apples   FY20H1    2000              -4000          -0.667
 6         1 a     apples   FY20H2    6000               3000           1    
 7         2 b     apples   FY18H1    3000               1000           0.5  
 8         2 b     apples   FY18H2    2000              -4000          -0.667
 9         2 b     apples   FY19H1    6000               3000           1    
10         2 b     apples   FY19H2    3000               1000           0.5 

CodePudding user response:

I think you should not group_by half. Try -

library(dplyr)

df %>% group_by(
  identifer,
  name,
  business
) %>%
  mutate(
    YoY_GROWTH_DOLLARS = dollars - lag(dollars, 2),
    YoY_GROWTH_RATE = round(YoY_GROWTH_DOLLARS/lag(dollars,2),4)
  ) %>% ungroup

#   identifer name  business half   dollars YoY_GROWTH_DOLLARS YoY_GROWTH_RATE
#       <dbl> <chr> <chr>    <chr>    <dbl>              <dbl>           <dbl>
# 1         1 a     apples   FY18H1    3000                 NA          NA    
# 2         1 a     apples   FY18H2    2000                 NA          NA    
# 3         1 a     apples   FY19H1    6000               3000           1    
# 4         1 a     apples   FY19H2    3000               1000           0.5  
# 5         1 a     apples   FY20H1    2000              -4000          -0.667
# 6         1 a     apples   FY20H2    6000               3000           1    
# 7         2 b     apples   FY18H1    3000                 NA          NA    
# 8         2 b     apples   FY18H2    2000                 NA          NA    
# 9         2 b     apples   FY19H1    6000               3000           1    
#10         2 b     apples   FY19H2    3000               1000           0.5  
# … with 20 more rows
  • Related