I have a data frame of 36 participants who each completed 500 trials in an experiment. Because of its size, I would like to write a function of some sort for my problem.
An example data frame could look like this:
data <- data.frame(ID    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   trial = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                   value = c(6, 8, 2, 5, 7, 2, 8, 9, 2))
"ID" is the identifier for the participant, "trial" is the trial number, and "value" is a numeric value (just random numbers in the example data frame).
I want to create a new column that holds, for every participant, the value of a trial minus the value of the subsequent trial.
So the value of trial 1 minus the value of trial 2, the value of trial 2 minus the value of trial 3, and so on.
Basically "value(trial(n)) - value(trial(n + 1))".
It doesn't seem to be a very difficult problem, but I still can't figure it out. Any help would be great; I am fairly new to R.
CodePudding user response:
You can group_by() the participant "ID" and then compute the difference with dplyr::lead(), which returns the next value within each group, i.e.
library(dplyr)
data %>%
  group_by(ID) %>%
  mutate(new_var = value - dplyr::lead(value))
Output:
# A tibble: 9 x 4
# Groups:   ID [3]
     ID trial value new_var
  <dbl> <dbl> <dbl>   <dbl>
1     1     1     6      -2
2     1     2     8       6
3     1     3     2      NA
4     2     1     5      -2
5     2     2     7       5
6     2     3     2      NA
7     3     1     8      -1
8     3     2     9       7
9     3     3     2      NA
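One thing to watch with the full 36 x 500 data set: lead() works on row order, not on the "trial" column itself. If the rows are not already sorted by trial within each ID, it may be safer to sort first. A minimal sketch, assuming the same columns as in the question (the `shuffled` and `result` names are just made up for illustration):

```r
library(dplyr)

# Same toy data as in the question
data <- data.frame(ID    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   trial = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                   value = c(6, 8, 2, 5, 7, 2, 8, 9, 2))

# Hypothetical: rows deliberately put out of order to mimic unsorted data
shuffled <- data[c(3, 1, 2, 6, 5, 4, 9, 7, 8), ]

result <- shuffled %>%
  arrange(ID, trial) %>%                  # restore trial order within each ID
  group_by(ID) %>%
  mutate(new_var = value - lead(value)) %>%
  ungroup()                               # drop grouping for later steps
```

The last trial of each ID has no subsequent trial, so its new_var stays NA.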
CodePudding user response:
A data.table option using shift():
library(data.table)
setDT(data)
data[, new_var := value - shift(value, type = "lead"), by = ID][]
Output:
   ID trial value new_var
1:  1     1     6      -2
2:  1     2     8       6
3:  1     3     2      NA
4:  2     1     5      -2
5:  2     2     7       5
6:  2     3     2      NA
7:  3     1     8      -1
8:  3     2     9       7
9:  3     3     2      NA
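If you would rather not load a package, the same idea can probably be done in base R with ave(), which applies a function within each group. This is only a sketch, assuming the rows are already in trial order within each ID; the anonymous function builds the "next value" vector by hand.

```r
# Same toy data as in the question
data <- data.frame(ID    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   trial = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                   value = c(6, 8, 2, 5, 7, 2, 8, 9, 2))

# Within each ID: current value minus the next value.
# c(v[-1], NA) shifts the vector up by one and pads the end with NA.
data$new_var <- ave(data$value, data$ID,
                    FUN = function(v) v - c(v[-1], NA))
```

As with the other answers, the last trial of each ID ends up as NA.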