I have a dataframe as the following:
DATE | Industry | Statistics | VALUE |
---|---|---|---|
2022-01-01 | Agriculture | Employment | 500 |
2022-01-01 | Agriculture | Unemployment | 200 |
2022-01-01 | Agriculture | Vacancies | 100 |
2022-01-01 | Construction | Employment | 800 |
2022-01-01 | Construction | Unemployment | 400 |
2022-01-01 | Construction | Vacancies | 100 |
I want to add new rows under Statistics called "UV.rate" which should be calculated by Unemployment divided by Vacancies sorted by DATE, then Industry
I tried this code, but it is not working.
newdf <- bind_rows(
df,
df %>%
summarise(VALUE = VALUE[Statistics == "Unemployment"] / VALUE[Statistics == "Vacancies"],
Statistics = "UV.rate")
) %>%
CodePudding user response:
You missed group_by
bind_rows(
df,
df %>%
group_by(DATE, Industry) %>%
summarise(VALUE = VALUE[Statistics == "Unemployment"] / VALUE[Statistics == "Vacancies"],
Statistics = "UV.rate")
)
DATE Industry VALUE Statistics
<chr> <chr> <dbl> <chr>
1 2022-01-01 Agriculture 2 UV.rate
2 2022-01-01 Agriculture 500 Employment
3 2022-01-01 Agriculture 200 Unemployment
4 2022-01-01 Agriculture 100 Vacancies
5 2022-01-01 Construction 4 UV.rate
6 2022-01-01 Construction 800 Employment
7 2022-01-01 Construction 400 Unemployment
8 2022-01-01 Construction 100 Vacancies
CodePudding user response:
A slightly shorter alternative is to reshape long to wide to long.
df %>%
pivot_wider(names_from = Statistics, values_from = VALUE) %>%
mutate(UV.rate = Unemployment / Vacancies) %>%
pivot_longer(-c(DATE, Industry), names_to = "Statistics", values_to = "VALUE")
## A tibble: 8 x 4
# DATE Industry Statistics VALUE
# <chr> <chr> <chr> <dbl>
#1 2022-01-01 Agriculture Employment 500
#2 2022-01-01 Agriculture Unemployment 200
#3 2022-01-01 Agriculture Vacancies 100
#4 2022-01-01 Agriculture UV.rate 2
#5 2022-01-01 Construction Employment 800
#6 2022-01-01 Construction Unemployment 400
#7 2022-01-01 Construction Vacancies 100
#8 2022-01-01 Construction UV.rate 4