In R, trying to average one column based on selecting a certain value in another column

In R, I'm trying to average a subset of one column, selected by a value (an ID) in another column. For example, given 100 IDs, I pick ID 5 and want the average of the values in the other column for the rows where the ID is 5. I then want to do the same for each of the remaining IDs. What function should I use?

CodePudding user response:

Using dplyr:

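library(dplyr)

# Example data: 3 IDs with 3 random values each (no seed is set, so values differ per run)
dt <- data.frame(ID = rep(1:3, each = 3), values = runif(9, 1, 100))

# Group the rows by ID, then compute the mean of values within each group
dt %>%
  group_by(ID) %>%
  summarise(avg = mean(values))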

Output:

     ID   avg
  <int> <dbl>
1     1  41.9
2     2  79.8
3     3  39.3

Data used for the output above:

  ID    values
1  1  8.628964
2  1 99.767843
3  1 17.438596
4  2 79.700918
5  2 87.647472
6  2 72.135906
7  3 53.845573
8  3 50.205122
9  3 13.811414
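
To tie this back to the question (average the values for one chosen ID, then repeat for the rest), here is a minimal base R sketch. It assumes the nine values from the data listing above are typed in by hand instead of drawn with runif(); the names avg_id1, ids, and avg_all are made up for illustration.

```r
# Values copied from the data listing above
dt <- data.frame(
  ID = rep(1:3, each = 3),
  values = c(8.628964, 99.767843, 17.438596,
             79.700918, 87.647472, 72.135906,
             53.845573, 50.205122, 13.811414)
)

# Average for one chosen ID (ID 1 here): keep the matching rows, then take the mean
avg_id1 <- mean(dt$values[dt$ID == 1])

# Repeat the same subset-and-average step for every distinct ID
ids <- unique(dt$ID)
avg_all <- sapply(ids, function(id) mean(dt$values[dt$ID == id]))
names(avg_all) <- ids

print(round(avg_all, 1))
```

Rounded to one decimal, the results should agree with the avg column in the output above. For many IDs, group_by() with summarise() does the same thing without writing the loop yourself.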

CodePudding user response:

This is a grouped mean. In base R, it can be computed with aggregate():

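# Example data, regenerated with new random values (so the means differ from the dplyr answer)
dt <- data.frame(ID = rep(1:3, each = 3), values = runif(9, 1, 100))

# Formula interface: mean of values for each level of ID
aggregate(values ~ ID, data = dt, FUN = mean)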

Output:

  ID   values
1  1 40.07086
2  2 53.59345
3  3 47.80675
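
The two answers print different averages only because runif() draws new random values on every call. If you want the example to give the same numbers on each run, one option is to call set.seed() first. A small sketch, where the seed 42 and the names res1, dt2, and res2 are arbitrary choices for illustration:

```r
# set.seed() fixes the random number stream; 42 is an arbitrary choice
set.seed(42)
dt <- data.frame(ID = rep(1:3, each = 3), values = runif(9, 1, 100))
res1 <- aggregate(values ~ ID, data = dt, FUN = mean)

# Re-seeding and regenerating gives the same data, hence the same group means
set.seed(42)
dt2 <- data.frame(ID = rep(1:3, each = 3), values = runif(9, 1, 100))
res2 <- aggregate(values ~ ID, data = dt2, FUN = mean)

# Compare the two result data frames
print(identical(res1, res2))
```

With real data you would skip the runif() step and pass your own data frame to aggregate().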