Get the mean of a column for each group of IDs in another column using aggregate

I am trying to create a new data frame that holds the mean penalty score for each official number.

STATS<-UPDATED_PENALTY%>%
aggregate(UPDATED_PENALTY, by = list(UPDATED_PENALTY$OFFICIAL_NUMBER, UPDATED_PENALTY$PENALTY), FUN = mean)

But it gives me the following error:

Error in mean.default(X[[i]], ...) : 'trim' must be numeric of length one
In addition: There were 50 or more warnings (use warnings() to see the first 50)

CodePudding user response:

If you want the mean score for each official number, try:

Example data

set.seed(123)
updated_penalty <- data.frame(official_number = rep(1:5, each = 5),
                              penalty = rnbinom(25, mu = 5, size = 1.5))

BASE R

tapply(updated_penalty$penalty, updated_penalty$official_number, mean)

Output:

# 1   2   3   4   5 
# 5.0 3.8 1.4 4.2 5.4

If you want it in a data frame:

vals <- tapply(updated_penalty$penalty, updated_penalty$official_number, mean)
new_df <- data.frame(ref_id = rownames(vals),
                     mean_penalties = vals)

Output:

#   ref_id mean_penalties
# 1      1            5.0
# 2      2            3.8
# 3      3            1.4
# 4      4            4.2
# 5      5            5.4
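Since the question asks about aggregate(), here is a minimal sketch of the same calculation with it. The error in the question most likely comes from two things: the pipe already passes UPDATED_PENALTY as the first argument, so the second, unnamed UPDATED_PENALTY is forwarded to mean() and ends up matched to its trim argument; and the by list also groups on the PENALTY column itself. The toy_penalty data frame and its values are made up for illustration and are not from the original post.

```r
# Hypothetical toy data (values invented for illustration only)
toy_penalty <- data.frame(
  official_number = c(1, 1, 2, 2, 3),
  penalty         = c(2, 4, 1, 3, 10)
)

# Formula interface: mean of penalty within each official_number
stats <- aggregate(penalty ~ official_number, data = toy_penalty, FUN = mean)

# Equivalent call with by =: group only on the ID column and
# pass only the column to be averaged as x
stats_by <- aggregate(toy_penalty["penalty"],
                      by = list(official_number = toy_penalty$official_number),
                      FUN = mean)

print(stats)
```

Both calls return a data frame with one row per official_number, so no extra conversion step is needed.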

DPLYR

library(dplyr)  # provides %>%, group_by() and summarize()

updated_penalty %>% 
  group_by(official_number) %>% 
  summarize(mean = mean(penalty))

Output:

#   official_number  mean
#             <int> <dbl>
# 1               1   5  
# 2               2   3.8
# 3               3   1.4
# 4               4   4.2
# 5               5   5.4