I have several columns with reaction times (RT) per participant. This is a simplified version of the dataframe I have (in reality I have 20 RT columns):
df <- data.frame(id=c(1,1,1,5,5,5), RT_0=c(0.467,NA,0.634,0.416,NA,0.35), RT_1=c(0.3,0.35,0.301,0.267,0.3,0.449), RT_2=c(1.251,NA, 0.334,0.34, 0.317,0.42), RT_3=c(0.284,0.316,0.283,0.367,0.55,0.35),pos=c(1,6,2,4,2,6))
id RT_0 RT_1 RT_2 RT_3 pos
1 0.467 0.3 1.251 0.284 1
1 NA 0.35 NA 0.316 6
1 0.634 0.301 0.334 0.283 2
5 0.416 0.267 0.34 0.367 4
5 NA 0.3 0.317 0.55 2
5 0.35 0.449 0.42 0.35 6
I want to calculate the overall mean of the RT columns by id. For example, the mean of all reaction times of participant nº1 was 0.452.
Something like this:
id RT_0 RT_1 RT_2 RT_3 pos mean
1 0.467 0.3 1.251 0.284 1 0.452
1 NA 0.35 NA 0.316 6 0.452
1 0.634 0.301 0.334 0.283 2 0.452
5 0.416 0.267 0.34 0.367 4 0.375
5 NA 0.3 0.317 0.55 2 0.375
5 0.35 0.449 0.42 0.35 6 0.375
CodePudding user response:
We may group by 'id', select the columns that starts_with
'RT_' in column names within across
(or pick
), unlist
the columns to a vector
and get the overall mean with mean
and assign it as new column in mutate
library(dplyr)
df %>%
group_by(id) %>%
mutate(mean = mean(unlist(across(starts_with("RT_"))),
na.rm = TRUE)) %>%
ungroup
-output
# A tibble: 6 × 7
id RT_0 RT_1 RT_2 RT_3 pos mean
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0.467 0.3 1.25 0.284 1 0.452
2 1 NA 0.35 NA 0.316 6 0.452
3 1 0.634 0.301 0.334 0.283 2 0.452
4 5 0.416 0.267 0.34 0.367 4 0.375
5 5 NA 0.3 0.317 0.55 2 0.375
6 5 0.35 0.449 0.42 0.35 6 0.375
Or with data.table
library(data.table)
setDT(df)[, mean := mean(unlist(.SD), na.rm = TRUE),
.(id), .SDcols = patterns("^RT_")]
-output
> df
id RT_0 RT_1 RT_2 RT_3 pos mean
1: 1 0.467 0.300 1.251 0.284 1 0.4520000
2: 1 NA 0.350 NA 0.316 6 0.4520000
3: 1 0.634 0.301 0.334 0.283 2 0.4520000
4: 5 0.416 0.267 0.340 0.367 4 0.3750909
5: 5 NA 0.300 0.317 0.550 2 0.3750909
6: 5 0.350 0.449 0.420 0.350 6 0.3750909