Mean of a column only for observations meeting a condition

Time:11-06

How can I add a column with the mean of z for each group in "y", computed only over the rows where x < 10? In any other case the mean column can take the value of z.

df <- data.frame(y = c(LETTERS[1:5], LETTERS[1:5],LETTERS[3:7]), x = 1:15, z = c(4:9,1:4,2:6))
   y  x z
1  A  1 4
2  B  2 5
3  C  3 6
4  D  4 7
5  E  5 8
6  A  6 9
7  B  7 1
8  C  8 2
9  D  9 3
10 E 10 4
11 C 11 2
12 D 12 3
13 E 13 4
14 F 14 5
15 G 15 6

I am trying something like

df %>% group_by(y) %>%
  mutate(gr.mean = mean(z))

but this gives the group mean over all values of x, not only those where x < 10.

CodePudding user response:

We can subset the 'z' with a logical condition on 'x'

library(dplyr)
df %>%
  group_by(y) %>%
  mutate(gr.mean = if (all(x >= 10)) z else mean(z[x < 10])) %>%
  ungroup()

-output

# A tibble: 15 × 4
   y         x     z gr.mean
   <chr> <int> <int>   <dbl>
 1 A         1     4     6.5
 2 B         2     5     3  
 3 C         3     6     4  
 4 D         4     7     5  
 5 E         5     8     8  
 6 A         6     9     6.5
 7 B         7     1     3  
 8 C         8     2     4  
 9 D         9     3     5  
10 E        10     4     8  
11 C        11     2     4  
12 D        12     3     5  
13 E        13     4     8  
14 F        14     5     5  
15 G        15     6     6  
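
The `if (all(x >= 10))` guard is needed because a group with no row where x < 10 leaves an empty subset, and the mean of an empty vector is `NaN`. A minimal base R sketch of that behaviour, using groups C and F from the example data (the names `grp_c`, `grp_f` and the `mean_*` variables are only illustrative):

```r
# Example data from the question
df <- data.frame(y = c(LETTERS[1:5], LETTERS[1:5], LETTERS[3:7]),
                 x = 1:15, z = c(4:9, 1:4, 2:6))

# Group C has rows on both sides of the threshold (x = 3, 8, 11)
grp_c <- df[df$y == "C", ]
mean_c_all  <- mean(grp_c$z)                # averages all three rows
mean_c_lt10 <- mean(grp_c$z[grp_c$x < 10])  # averages only the rows with x < 10

# Group F has no row with x < 10, so the subset is empty
grp_f <- df[df$y == "F", ]
mean_f_lt10 <- mean(grp_f$z[grp_f$x < 10])  # mean of an empty vector is NaN
```

Here `mean_c_lt10` matches the 4 shown for group C in the output above, while `mean_f_lt10` is `NaN`, which is why groups F and G fall back to their own z.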

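Or without if/else, letting coalesce() fall back to 'z' when the conditional mean is missing (NaN for an empty subset)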

df %>%
  group_by(y) %>%
  mutate(gr.mean = coalesce(mean(z[x < 10]), z))
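
For comparison, the same column can probably be built without dplyr. The following is a rough base R sketch using `ave()`; it is an alternative technique rather than part of the answer above, and the helper names `z_lt10` and `cond_mean` are made up for illustration.

```r
# Example data from the question
df <- data.frame(y = c(LETTERS[1:5], LETTERS[1:5], LETTERS[3:7]),
                 x = 1:15, z = c(4:9, 1:4, 2:6))

# Keep z only where x < 10; other rows become NA so they drop out of the mean
z_lt10 <- as.numeric(ifelse(df$x < 10, df$z, NA))

# Per-group mean of the kept values, repeated on every row of the group;
# groups with no kept values yield NaN
cond_mean <- ave(z_lt10, df$y, FUN = function(v) mean(v, na.rm = TRUE))

# Fall back to the row's own z where no conditional mean exists
df$gr.mean <- ifelse(is.nan(cond_mean), df$z, cond_mean)
```

The resulting `df$gr.mean` should agree with the `gr.mean` column in the tibble printed earlier.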