Below is the sample code. The task at hand is to create a year over year difference (2021 q4 value - 2020 q4 value) for only the fourth quarter and percentage difference. Desired result is below. Usually I would do a pivot_wider and such. However, how does one do this and not take all quarters into account?
year <- c(2020,2020,2020,2020,2021,2021,2021,2021,2020,2020,2020,2020,2021,2021,2021,2021)
qtr <- c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
area <- c(1012,1012,1012,1012,1012,1012,1012,1012,1402,1402,1402,1402,1402,1402,1402,1402)
employment <- c(100,102,104,106,108,110,114,111,52,54,56,59,61,66,65,49)
test1 <- data.frame (year,qtr,area,employment)
area difference percentage
1012 5 4.7%
1402 -10 -16.9
CodePudding user response:
You would use filter
on quarter:
test1 |>
filter(qtr == 4) |>
group_by(area) |>
mutate(employment_lag = lag(employment),
diff = employment - employment_lag) |>
na.omit() |>
ungroup() |>
mutate(percentage = diff/employment_lag)
Output:
# A tibble: 2 × 7
year qtr area employment diff employment_start percentage
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2021 4 1012 111 5 106 0.0472
2 2021 4 1402 49 -10 59 -0.169
Update: Adding correct percentage.