Fairly new to R, ended up in the following situation: I want to create a summary row for each group in the dataframe based on Year
and Model
, where a value
of each row would be based on the subtraction of value
of one Variable
from others in the group.
df <- data.frame(Model = c(1,1,1,2,2,2,2,2,2,2,2,2,2),
Year = c(2020, 2020, 2020, 2020, 2020, 2020, 2020, 2030, 2030, 2030, 2040, 2040, 2040),
Variable = c("A", "B", "C", "A", "B", "C", "D", "A", "C", "E", "A", "C", "D"),
value = c(15, 2, 5, 25, 6, 4, 4, 41, 24,1, 15, 3, 2))
I have managed to create a new row for each group, so it already has a Year
and a Variable
name that I manually specified using:
df <- df %>% group_by(Model, Year) %>% group_modify(~ add_row(., Variable = "New", .before=0))
However, I am struggling to create an equation from which I want to calculate the value
.
What I want to have instead of NAs: value
of A-B-D in each group
Would appreciate any help. My first thread here, pardon for any inconvenience.
CodePudding user response:
You could pivot wide and then back; this would add rows with zeros where missing:
library(dplyr); library(tidyr)
df %>%
pivot_wider(names_from = Variable, values_from = value, values_fill = 0) %>%
mutate(new = A - B - D) %>%
pivot_longer(-c(Model, Year), names_to = "Variable")
# A tibble: 24 × 4
Model Year Variable value
<dbl> <dbl> <chr> <dbl>
1 1 2020 A 15
2 1 2020 B 2
3 1 2020 C 5
4 1 2020 D 0
5 1 2020 E 0
6 1 2020 new 13 # 15 - 2 - 0 = 13
7 2 2020 A 25
8 2 2020 B 6
9 2 2020 C 4
10 2 2020 D 4
# … with 14 more rows
EDIT - variation where we leave the missing values and use coalesce(x, 0)
to allow subtraction to treat NA's as zeroes. The pivot_wider
creates NA's in the missing spots, but we can exclude these in the pivot_longer
using values_drop_na = TRUE
.
df %>%
pivot_wider(names_from = Variable, values_from = value) %>%
mutate(new = A - coalesce(B,0) - coalesce(D,0)) %>%
pivot_longer(-c(Model, Year), names_to = "Variable", values_drop_na = TRUE)
# A tibble: 17 × 4
Model Year Variable value
<dbl> <dbl> <chr> <dbl>
1 1 2020 A 15
2 1 2020 B 2
3 1 2020 C 5
4 1 2020 new 13
5 2 2020 A 25
6 2 2020 B 6
7 2 2020 C 4
8 2 2020 D 4
9 2 2020 new 15
10 2 2030 A 41
11 2 2030 C 24
12 2 2030 E 1
13 2 2030 new 41
14 2 2040 A 15
15 2 2040 C 3
16 2 2040 D 2
17 2 2040 new 13