tidyr mutate new column based on group by with calculation

Time:11-02

Using tidyr, how can I create a new column through a group-by and calculation?

For example, if I have this dataframe:

name <- c("a", "a", "a", "a", "b", "b", "b", "b")
x1 <- c(0, 0, 0, 0, 1, 1, 1, 1)
x2 <- c(15, 15, 15, 15, 15, 15, 15, 15)
y <- c(1, 2, 1, 2, 1, 2, 1, 2)
z <- c(50, 100, 40, 90, 65, 95, 40, 95)

df <- data.frame(name, x1, x2, y, z)

Let's say I want to (1) group by x1 and x2; (2) find the max z value in each group; and (3) create a new column z2 that normalizes z by that maximum.


So in this case, the expected output for z2 is c(0.5, 1, 0.4, 0.9, 0.684, 1, 0.421, 1).

CodePudding user response:

We could simply group by 'x1' and 'x2' and create the column with mutate (both functions come from dplyr rather than tidyr).

library(dplyr)

# Scale z by the largest z within each (x1, x2) group
df <- df %>%
    group_by(x1, x2) %>%
    mutate(z2 = z / max(z, na.rm = TRUE)) %>%
    ungroup()

-output

df
# A tibble: 8 × 6
  name     x1    x2     y     z    z2
  <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 a         0    15     1    50 0.5  
2 a         0    15     2   100 1    
3 a         0    15     1    40 0.4  
4 a         0    15     2    90 0.9  
5 b         1    15     1    65 0.684
6 b         1    15     2    95 1    
7 b         1    15     1    40 0.421
8 b         1    15     2    95 1    
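
If loading dplyr is not an option, the same per-group normalization can probably be done in base R with ave(), which applies a function within each combination of the grouping vectors and returns the results in the original row order. This is only a sketch of an alternative, not part of the answer above; it rebuilds the example data frame from the question so that it runs on its own.

```r
# Example data from the question
name <- c("a", "a", "a", "a", "b", "b", "b", "b")
x1 <- c(0, 0, 0, 0, 1, 1, 1, 1)
x2 <- c(15, 15, 15, 15, 15, 15, 15, 15)
y <- c(1, 2, 1, 2, 1, 2, 1, 2)
z <- c(50, 100, 40, 90, 65, 95, 40, 95)
df <- data.frame(name, x1, x2, y, z)

# ave() splits z by every (x1, x2) combination, applies FUN to each piece,
# and puts the results back in the original row positions
df$z2 <- ave(df$z, df$x1, df$x2,
             FUN = function(v) v / max(v, na.rm = TRUE))

print(round(df$z2, 3))
```

Unlike the dplyr version, the result stays a plain data.frame rather than a tibble, and there is no grouping to remove afterwards.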