I really need your help, any support will be much appreciated.
This is the example data:
tibble::tribble(
~Item, ~Amount, ~Date,
"saw", 10L, "01/01/2020",
"saw", 20L, "04/01/2020",
"saw", 5L, "03/06/2020",
"saw", 30L, "02/07/2020",
"nails", 300L, "02/02/2020",
"nails", 200L, "04/03/2020",
"nails", 250L, "05/03/2020",
"nails", 400L, "10/05/2020"
)
I am trying to create a new column next to date that captures the percent change in amount for each item from one date to the next.
The result would look like:
tibble::tribble(
~Item, ~Amount, ~Date, ~`%.change`,
"saw", 10L, "01/01/2020", "0%",
"saw", 20L, "04/01/2020", "100%",
"saw", 5L, "03/06/2020", "-75%",
"saw", 30L, "02/07/2020", "500%",
"nails", 300L, "02/02/2020", "0%",
"nails", 200L, "04/03/2020", "-33%",
"nails", 250L, "05/03/2020", "25%",
"nails", 400L, "10/05/2020", "60%"
)
The formula for percent change: (new observation- original observation)/original observation * 100
And for the first date for each item, when there still hasn't been any change value of 0 or empty cell.
Again any help will be much appreciated, and huge thanks in advance!!
CodePudding user response:
You can group by the Item
column then, to get the "original observation" for your calculation, use dplyr::lag
. This will make the first item in each group NA
, so use an ifelse
to change it to 0%.
library(dplyr)
df %>%
group_by(Item) %>%
mutate(`%.change` = scales::percent((Amount - lag(Amount))/lag(Amount)),
`%.change` = ifelse(is.na(`%.change`), "0%", `%.change`))
#> # A tibble: 8 x 4
#> # Groups: Item [2]
#> Item Amount Date `%.change`
#> <chr> <int> <chr> <chr>
#> 1 saw 10 01/01/2020 0%
#> 2 saw 20 04/01/2020 100%
#> 3 saw 5 03/06/2020 -75%
#> 4 saw 30 02/07/2020 500%
#> 5 nails 300 02/02/2020 0%
#> 6 nails 200 04/03/2020 -33%
#> 7 nails 250 05/03/2020 25%
#> 8 nails 400 10/05/2020 60%