I am developing a data processing/analysis pipeline in R.
My data is in long format (sample rate = 1000 Hz). I have added a trialNum variable that labels each trial throughout the dataframe, but I am having trouble reshaping the data to wide.
What I am trying to do, and I think it should be possible with a for loop or two, is get the average value of x at each within-trial index (1:100), averaged across the trials identified by trialNum.
Here is a simplified version:
Pupil Size | TrialNum |
---|---|
500 | 1 |
502 | 1 |
504 | 1 |
506 | 1 |
508 | 1 |
507 | 2 |
508 | 2 |
510 | 2 |
511 | 2 |
512 | 2 |
513 | 3 |
515 | 3 |
514 | 3 |
512 | 3 |
515 | 3 |
Stated simply: I would take the first sample of Pupil Size from each TrialNum, average those together, and store the result in a new variable (average_pupil_size), then do the same for the second sample, and so on.
In this example each trial has 5 samples, so I would end up with an output of length 5 (values rounded):
average_size <- c(507, 508, 509, 510, 512)
I could then plot this signal across all my trials. I hope I have explained myself clearly. Apologies for the chaos that is my mind.
Does anyone know how to do this? It is a bit beyond me.
Thanks in advance!
CodePudding user response:
We could add an index within each TrialNum using row_number(), and then group by that index and summarize.
library(dplyr)
df %>%
  group_by(TrialNum) %>%
  mutate(index = row_number()) %>%    # position of each sample within its trial
  group_by(index) %>%
  summarize(avg = mean(Pupil.Size))   # mean across trials at each position
Result:
# A tibble: 5 × 2
  index   avg
  <int> <dbl>
1     1  507.
2     2  508.
3     3  509.
4     4  510.
5     5  512.
CodePudding user response:
In base R, if every trial has the same length (5 in this case) and df holds only the two columns shown, we can do:
rowMeans(unstack(df))
[1] 506.6667 508.3333 509.3333 509.6667 511.6667
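If the trials might not all have the same length, a base R variant that builds the within-trial index explicitly may be more forgiving. The following is only a sketch: it recreates the example data from the question, and the column names Pupil.Size and TrialNum as well as the new index column are assumptions about how the real dataframe is laid out.

```r
# Example data copied from the question (column names are assumed).
df <- data.frame(
  Pupil.Size = c(500, 502, 504, 506, 508,
                 507, 508, 510, 511, 512,
                 513, 515, 514, 512, 515),
  TrialNum = rep(1:3, each = 5)
)

# Position of each sample within its own trial (1, 2, 3, ... per trial).
df$index <- ave(seq_along(df$TrialNum), df$TrialNum, FUN = seq_along)

# Mean pupil size at each within-trial position, taken across trials.
# If trials differ in length, later positions are averaged over fewer trials.
average_size <- tapply(df$Pupil.Size, df$index, mean)

print(average_size)
```

The result is a named vector with one mean per sample position, so something like plot(average_size, type = "l") should draw the trial-averaged signal.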