Mean values from selected row groups

Time:01-27

I would like to determine the mean of selected groups of rows (e.g. the mean values for the compounds "PosCtrl", "Ab1", "Ab1_gl", "Ab2", etc.) in the following data frame:

structure(list(Compounds = c("PosCtrl", "PosCtrl", "PosCtrl", 
"PosCtrl", "Ab1", "Ab1", "Ab1", "Ab1", "Ab1", "Ab1_gl", "Ab1_gl", 
"Ab1_gl", "Ab1_gl", "Ab1_gl", "Ab1_gl", "Ab2", "Ab2", "Ab2", 
"Ab2", "Ab2", "Ab2", "Ab3", "Ab3", "Ab3", "Ab3", "Ab3", "Ab4", 
"Ab4", "Ab4", "Ab4", "Ab5", "Ab5", "Ab5", "Ab5", "Ab5", "negctrl", 
"negctrl", "negctrl", "negctrl", "negctrl"), Values = c(7.77, 
5.78, 7.01, 7.23, 0.99, 0.91, 1.23, 0.86, 0.93, 0.76, 0.89, 0.58, 
0.8, 0.76, 0.46, 0.91, 0.8, 0.91, 1, 0.64, 0.75, 0.89, 0.87, 
0.77, 0.89, 0.91, 0.82, 1.33, 1.14, 1.44, 1.03, 1.02, 0.88, 0.99, 
1.1, 0.76, 0.68, 0.93, 0.84, 0.8)), class = "data.frame", row.names = c(NA, 
40L))

and then I would like to generate a table with the mean values. I have multiple data frames with thousands of values per category (compound).

This is what I would like to obtain per data frame:

structure(list(PosCtrl = 6.95, Ab1 = 0.98, Ab1_gl = 0.71, Ab2 = 0.83, 
    Ab3 = 0.86, Ab4 = 1.19, Ab5 = 1, negctrl = 0.8), class = "data.frame", row.names = "Mean")

This is the code I have tried, but it returns an error (x must be numeric):

Data1$mean <-
  with (Data1, ave( Values, findInterval(Compounds, c(PosCtrl, Ab1, Ab1_gl,Ab2,Ab3, Ab4, Ab5, negctrl)), FUN= mean))

Many thanks.

CodePudding user response:

A tidyverse solution:

library(dplyr)
library(tidyr)
# df is the data frame from the question
df %>%
  group_by(Compounds) %>%
  summarise(mean = mean(Values)) %>%
  pivot_wider(names_from = Compounds, values_from = mean)

# A tibble: 1 × 8
    Ab1 Ab1_gl   Ab2   Ab3   Ab4   Ab5 negctrl PosCtrl
  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>
1 0.984  0.708 0.835 0.866  1.18  1.00   0.802    6.95

CodePudding user response:

You can use aggregate instead of ave:

> aggregate(Values ~ Compounds, FUN=mean, na.rm=TRUE, data=dat)
  Compounds    Values
1       Ab1 0.9840000
2    Ab1_gl 0.7083333
3       Ab2 0.8350000
4       Ab3 0.8660000
5       Ab4 1.1825000
6       Ab5 1.0040000
7   negctrl 0.8020000
8   PosCtrl 6.9475000
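
Since the question mentions several data frames, the same aggregate call can be applied to each of them with lapply. This is only a sketch: the list frames, its element names, and the toy values are made up for illustration, and it assumes every data frame has the same Compounds and Values columns.

```r
# Hypothetical list of data frames sharing the Compounds/Values layout
frames <- list(
  plate1 = data.frame(Compounds = c("Ab1", "Ab1", "Ab2"), Values = c(1, 3, 5)),
  plate2 = data.frame(Compounds = c("Ab1", "Ab2", "Ab2"), Values = c(2, 4, 8))
)

# One table of group means per data frame, returned as a named list
means_by_frame <- lapply(frames, function(d) {
  aggregate(Values ~ Compounds, data = d, FUN = mean, na.rm = TRUE)
})
```

Each element of means_by_frame then has the same two-column layout as the output above.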

tapply is also a good alternative:

> with(dat, tapply(Values, Compounds, mean, na.rm=TRUE))
      Ab1    Ab1_gl       Ab2       Ab3       Ab4       Ab5   negctrl   PosCtrl 
0.9840000 0.7083333 0.8350000 0.8660000 1.1825000 1.0040000 0.8020000 6.9475000 
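
To get closer to the shape asked for in the question (a one-row data frame with the row name "Mean" and the compounds in their original order rather than sorted alphabetically), the group means can be reshaped afterwards. A minimal sketch, assuming the data frame has the Compounds and Values columns from the question. The helper name mean_row is hypothetical, and split/vapply is used in place of tapply only so that the result is a plain named vector.

```r
# Hypothetical helper: one-row data frame of per-group means
mean_row <- function(dat) {
  # Set factor levels by first appearance so columns keep the input order
  groups <- factor(dat$Compounds, levels = unique(dat$Compounds))
  # Named numeric vector with one mean per compound
  means <- vapply(split(dat$Values, groups), mean, numeric(1), na.rm = TRUE)
  # t() turns the named vector into a 1-row matrix with column names
  out <- as.data.frame(t(means))
  rownames(out) <- "Mean"
  out
}
```

Calling mean_row(dat) on the example data should give one column per compound, starting with PosCtrl; apply round(..., 2) to the result if two decimals are wanted.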