How to compute the mean of n observations (repeated measurements of the same sample)

I have a big dataframe similar to this one:

df <- data.frame(sample=c('s1a', 's1b', 's2a', 's2b', 's3a', 's3b'), Mg=1:6, P=7:12, K=3:8)

where "a" and "b" are repeated measurements of the same sample. I would like to obtain a new data frame with the mean of each measurement per sample (s1, s2, s3), something like this:

df_new <- data.frame(sample=c('s1', 's2', 's3'), Mg=c(1.5, 3.5, 5.5), P=c(7.5, 9.5, 11.5), K=c(3.5, 5.5, 7.5))

CodePudding user response：

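You can use aggregate() to compute the per-sample means, with sub() stripping the trailing "a" or "b" from the sample names so that both replicates of a sample share one group label.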

aggregate(. ~ sample, transform(df, sample = sub("[ab]$", "", sample)), mean)
#aggregate(. ~ sample, within(df, sample <- sub("[ab]$", "", sample)), mean) #Alternative
#aggregate(df[-1], list(sample=sub("[ab]$", "", df[,1])), mean) #Alternative
#  sample  Mg    P   K
#1     s1 1.5  7.5 3.5
#2     s2 3.5  9.5 5.5
#3     s3 5.5 11.5 7.5
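If there can be more than two replicates per sample, the same idea should carry over by widening the pattern. The following is only a sketch, assuming every replicate label is a single trailing lowercase letter; the pattern "[a-z]$" and the names grp and res are illustrative choices, not part of the answer above.

```r
# Example data from the question
df <- data.frame(sample = c("s1a", "s1b", "s2a", "s2b", "s3a", "s3b"),
                 Mg = 1:6, P = 7:12, K = 3:8)

# Assumption: the replicate label is exactly one trailing lowercase letter
grp <- sub("[a-z]$", "", df$sample)

# Average every measurement column within each sample
res <- aggregate(df[-1], list(sample = grp), mean)
print(res)
```

On the example data this gives the same three rows (s1, s2, s3) as the calls above, because "[a-z]$" also matches "a" and "b".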

CodePudding user response：

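library(tidyverse)

df %>%
  # Drop the trailing replicate letter so 's1a' and 's1b' share the key 's1'
  group_by(sample = str_remove(sample, "[ab]$")) %>%
  # Average every remaining (non-grouping) column within each sample
  summarise(across(everything(), mean))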

# A tibble: 3 × 4
  sample    Mg     P     K
  <chr>  <dbl> <dbl> <dbl>
1 s1       1.5   7.5   3.5
2 s2       3.5   9.5   5.5
3 s3       5.5  11.5   7.5
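
With either answer, the result depends on how the grouping key is derived from the sample name. A fixed-width prefix and a suffix-stripping pattern agree on the example data but can diverge once sample IDs have different lengths. A minimal sketch in base R, using hypothetical IDs that do not appear in the question:

```r
# Hypothetical sample IDs of different lengths (not from the question)
ids <- c("s9a", "s9b", "s10a", "s10b")

# Key built from the first two characters: 's10a' collapses to 's1'
prefix_key <- substr(ids, 1, 2)

# Key built by removing the trailing replicate letter: 's10a' becomes 's10'
suffix_key <- sub("[ab]$", "", ids)

print(data.frame(ids, prefix_key, suffix_key))
```

Stripping the suffix keeps "s10" intact, so it is probably the safer choice when the number of samples may grow past nine.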