I have a data frame:
  stim1 stim2     stim_time feedback_time feedback   RT
1     2     3 1607325455244 1607325458496        1 3252
2     1     3 1607325462034 1607325464723        1 2689
3     2     1 1607325474049 1607325476997        1 2948
4     1     2 1607325480994 1607325483135        1 2141
5     3     1 1607325492735 1607325495914        1 3179
6     3     2 1607325498941 1607325501287        1 2346
My objective is to add, in an efficient manner, columns that hold the cumulative RT (column 6) spent so far on whatever is in stim1 and stim2.
For example, in row 4:
stim1 = 1. It was previously shown in trial 2 and trial 3,
so left_time_spent(4) = RT(2) + RT(3) = 2689 + 2948 = 5637.
stim2 = 2. It was previously shown in trial 1 and trial 3, with RT 3252 + 2948 = 6200,
so right_time_spent(4) = 6200.
Desired output:
   stim1 stim2     stim_time feedback_time feedback   RT left_time_spent right_time_spent
1:     2     3 1607325455244 1607325458496        1 3252               0                0
2:     1     3 1607325462034 1607325464723        1 2689               0             3252
3:     2     1 1607325474049 1607325476997        1 2948            3252             2689
4:     1     2 1607325480994 1607325483135        1 2141            5637             6200
I am sharing a shortened dput object as well. Thank you so much for all your help!
structure(list(stim1 = c(2L, 1L, 2L, 1L), stim2 = c(3L, 3L, 1L,
2L), stim_time = structure(c(7.94124289122202e-312, 7.94124292476908e-312,
7.94124298413107e-312, 7.94124301844393e-312), class = "integer64"),
feedback_time = structure(c(7.94124290728904e-312, 7.94124293805451e-312,
7.94124299869612e-312, 7.94124302902187e-312), class = "integer64"),
feedback = c(1L, 1L, 1L, 1L), RT = structure(c(1.60670148027573e-320,
1.32854252166711e-320, 1.45650552393999e-320, 1.05779454774611e-320
), class = "integer64"), left_time_spent = c(0, 0, 3252,
5637), right_time_spent = c(0, 3252, 2689, 6200)), class = c("data.table",
"data.frame"), row.names = c(NA, -4L), .internal.selfref = <pointer: 0x7f8c2600e4e0>)
Answer:
An Rcpp solution:
dualcumsum <- Rcpp::cppFunction('
List dualcumsum(const IntegerVector& x, const IntegerVector& y, const IntegerVector& val, const int& numxy) {
  const int n = x.size();
  IntegerVector csum(numxy + 1);  // running RT total per stimulus code, indices 0..numxy
  IntegerVector out1(n);
  IntegerVector out2(n);
  for (int i = 0; i < n; i++) {
    // totals accumulated before trial i
    out1(i) = csum(x(i));
    out2(i) = csum(y(i));
    // credit the RT of trial i to both stimuli shown
    csum(x(i)) += val(i);
    csum(y(i)) += val(i);
  }
  return(List::create(out1, out2));
}
')
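For readers who want to check the logic without compiling C++, the same running-total idea can be sketched in plain R. This is a slower reference version, not part of the original answer: the name dualcumsum_r is hypothetical, and it assumes the stimulus codes are positive integers in 1..numxy (unlike the Rcpp version, it does not accept a code of 0).

```r
# Plain-R reference sketch of the same running-total logic.
# dualcumsum_r is a hypothetical name; assumes x and y hold codes in 1..numxy.
dualcumsum_r <- function(x, y, val, numxy) {
  n <- length(x)
  csum <- numeric(numxy)  # running RT total per stimulus code
  out1 <- numeric(n)
  out2 <- numeric(n)
  for (i in seq_len(n)) {
    # totals accumulated before trial i
    out1[i] <- csum[x[i]]
    out2[i] <- csum[y[i]]
    # credit the RT of trial i to both stimuli shown
    csum[x[i]] <- csum[x[i]] + val[i]
    csum[y[i]] <- csum[y[i]] + val[i]
  }
  list(out1, out2)
}

# First four trials from the question
res <- dualcumsum_r(c(2L, 1L, 2L, 1L), c(3L, 3L, 1L, 2L),
                    c(3252, 2689, 2948, 2141), 3L)
```

Here res[[1]] and res[[2]] should reproduce the left_time_spent and right_time_spent columns from the question's desired output.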
Example use on a simplified table:
library(data.table)
dt <- data.table(stim1 = c(2L, 1L, 2L, 1L),
                 stim2 = c(3L, 3L, 1L, 2L),
                 RT = c(3252L, 2689L, 2948L, 2141L))
dt[, c("left_time_spent", "right_time_spent") := dualcumsum(stim1, stim2, as.integer(RT), max(c(stim1, stim2)))][]
   stim1 stim2   RT left_time_spent right_time_spent
1:     2     3 3252               0                0
2:     1     3 2689               0             3252
3:     2     1 2948            3252             2689
4:     1     2 2141            5637             6200
If stim1 and stim2 aren't 1- or 0-based integers, map them to integer indices with match first. The indices then run from 1 to length(stims), so that is the value to pass as numxy:
stims <- with(dt, unique(c(stim1, stim2)))
dt[, c("left_time_spent", "right_time_spent") := dualcumsum(match(stim1, stims), match(stim2, stims), as.integer(RT), length(stims))]
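To see what the match step produces, here is a small sketch using made-up character labels (the labels "A", "B", "C" are hypothetical and only for illustration). Each label is replaced by its position in stims, so the resulting indices always fall in 1..length(stims) and can be used to index the running totals.

```r
# Hypothetical character-coded stimuli, for illustration only
stim1 <- c("B", "A", "B", "A")
stim2 <- c("C", "C", "A", "B")

# Distinct labels in order of first appearance
stims <- unique(c(stim1, stim2))

# Position of each label in stims: integer codes in 1..length(stims)
idx1 <- match(stim1, stims)
idx2 <- match(stim2, stims)

print(stims)  # "B" "A" "C"
print(idx1)   # 1 2 1 2
print(idx2)   # 3 3 2 1
```

The integer vectors idx1 and idx2 are what get passed in place of stim1 and stim2.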