Below is the sample data
indcode <- c(71,72,81,82,99,000000,71,72,81,82,99,000000)
year <- c(2020,2020,2020,2020,2020,2020,2021,2021,2021,2021,2021,2021)
employment <- c(3,5,7,9,2,26,4,6,8,10,3,31)
test <- data.frame(indcode, year, employment)
The task is to create a new column that would the 000000 value for each year. I know that this involves a pivot wider but how to get the 000000 to repeat is my struggle. Below is the desired result. Hoping to not have 000000 (Total, all industries) be a row. It would be essentially be a duplicate.
Year indcode employment total
2020 71 3 26
2020 72 5 26
2020 81 7 26
2020 82 9 26
2020 99 2 26
2021 71 4 31
and so on...
CodePudding user response:
We could do this by detecting one or more zeros (
) from the start (^
) to the end ($
) of the string in 'indcode' to subset the 'employment' for each 'year' (grouped) to create a new column and then filter
out the 0 rows
library(dplyr)
library(stringr)
test %>%
group_by(year) %>%
mutate(total = employment[str_detect(indcode, '^0 $')]) %>%
ungroup %>%
filter(str_detect(indcode, "^0 $", negate = TRUE))