Using tidyr, how can I create a new column through a group-by and calculation?
For example, if I have this dataframe:
name <- c("a", "a", "a", "a", "b", "b", "b", "b")
x1 <- c(0, 0, 0, 0, 1, 1, 1, 1)
x2 <- c(15, 15, 15, 15, 15, 15, 15, 15)
y <- c(1, 2, 1, 2, 1, 2, 1, 2)
z <- c(50, 100, 40, 90, 65, 95, 40, 95)
df <- data.frame(name, x1, x2, y, z)
Let's say I want to (1) group by x1 and x2; (2) find the max z value in each group; and (3) create a new column z2 that normalizes z by that maximum.
So in this case, the expected output for z2 is c(0.5, 1, 0.4, 0.9, 0.684, 1, 0.421, 1).
Answer:
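We could simply group by 'x1' and 'x2' and create the column with mutate. Note that group_by and mutate come from dplyr rather than tidyr; mutate keeps every row and evaluates max(z) separately within each group.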
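library(dplyr)
df <- df %>%
  group_by(x1, x2) %>%                       # one group per (x1, x2) combination
  mutate(z2 = z / max(z, na.rm = TRUE)) %>%  # divide each z by its group maximum
  ungroup()                                  # drop the grouping afterwards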
-output
df
# A tibble: 8 × 6
  name     x1    x2     y     z    z2
  <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 a         0    15     1    50 0.5
2 a         0    15     2   100 1
3 a         0    15     1    40 0.4
4 a         0    15     2    90 0.9
5 b         1    15     1    65 0.684
6 b         1    15     2    95 1
7 b         1    15     1    40 0.421
8 b         1    15     2    95 1
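
If loading dplyr is not an option, the same group-wise normalization can probably be done in base R with ave(), which applies a function within each combination of the grouping vectors and returns a result aligned with the original rows. This is only a sketch of an alternative, not part of the answer above; it rebuilds df from the question so that it runs on its own.

```r
# Rebuild the data frame from the question so the example is self-contained.
df <- data.frame(
  name = c("a", "a", "a", "a", "b", "b", "b", "b"),
  x1   = c(0, 0, 0, 0, 1, 1, 1, 1),
  x2   = c(15, 15, 15, 15, 15, 15, 15, 15),
  y    = c(1, 2, 1, 2, 1, 2, 1, 2),
  z    = c(50, 100, 40, 90, 65, 95, 40, 95)
)

# ave() splits z by every (x1, x2) combination, applies FUN to each piece,
# and puts the results back in the original row order.
df$z2 <- ave(df$z, df$x1, df$x2,
             FUN = function(v) v / max(v, na.rm = TRUE))

# Rounded to three decimals for comparison with the expected z2 in the question.
round(df$z2, 3)
```

Unlike the dplyr version, this leaves df as a plain data.frame instead of a tibble, so there is no grouping to remove afterwards.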