I have a data frame as follows:
head(data)
    subject block trial timeLeft timeRight stim1 stim2 Chosen
1         1    13     0        0         0     2     1      2
2         1    13     1        0         1     3     2      2
3         1    13     3        0         0     3     1      1
4         1    13     4        2         0     2     3      3
5         1    13     6        1         1     1     3      1
6         1    13     7        2         2     2     1      1
...
454    1006    14     0        0         0     6     5      5
455    1006    14     1        0         0     6     4      6
456    1006    14     3        0         1     4     5      4
457    1006    14     4        1         1     4     5      4
458    1006    14     6        1         2     6     4      6
My objective is to group by subject and block and keep only the rows prior to and including the row where both timeLeft and timeRight are 0.
In this case the output would be:
    subject block trial timeLeft timeRight stim1 stim2 Chosen
1         1    13     0        0         0     2     1      2
2         1    13     1        0         1     3     2      2
3         1    13     3        0         0     3     1      1
...
454    1006    14     0        0         0     6     5      5
455    1006    14     1        0         0     6     4      6
Thank you in advance!
Here is the structure of the data:
'data.frame': 64748 obs. of 8 variables:
$ subject : num 1 1 1 1 1 1 1 1 1 1 ...
$ block : int 13 13 13 13 13 13 13 13 13 13 ...
$ trial : int 0 1 3 4 6 7 9 10 12 13 ...
$ timeLeft : int 0 0 0 2 1 2 2 1 3 4 ...
$ timeRight: int 0 1 0 0 1 2 1 3 4 4 ...
$ stim1 : int 2 3 3 2 1 2 2 3 2 2 ...
$ stim2 : int 1 2 1 3 3 1 3 1 1 1 ...
$ Chosen : int 2 2 1 3 1 1 2 1 2 2 ...
CodePudding user response:
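You can do this with the help of a custom function:
library(dplyr)
# Return the row positions from the first to the second row in which both
# timers are 0; return 0 (no rows) if the group has fewer than two such rows.
select_rows <- function(timeLeft, timeRight) {
  inds <- which(timeLeft == 0 & timeRight == 0)
  if (length(inds) >= 2) inds[1]:inds[2]
  else 0
}
data %>%
  group_by(subject, block) %>%
  slice(select_rows(timeLeft, timeRight)) %>%
  ungroup()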
#  subject block trial timeLeft timeRight stim1 stim2 Chosen
#    <int> <int> <int>    <int>     <int> <int> <int>  <int>
#1       1    13     0        0         0     2     1      2
#2       1    13     1        0         1     3     2      2
#3       1    13     3        0         0     3     1      1
#4    1006    14     0        0         0     6     5      5
#5    1006    14     1        0         0     6     4      6
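To see what the helper hands to slice(), it can be called directly on the timeLeft and timeRight values of each group. The sketch below repeats the function so it runs on its own; the third call uses a made-up group, assumed only for illustration, with a single 0/0 row.

```r
# Helper from the answer above, repeated so this snippet is self-contained.
select_rows <- function(timeLeft, timeRight) {
  inds <- which(timeLeft == 0 & timeRight == 0)
  if (length(inds) >= 2) inds[1]:inds[2] else 0
}

# Subject 1, block 13 in the sample data: rows 1 and 3 are both zero.
select_rows(c(0, 0, 0, 2, 1, 2), c(0, 1, 0, 0, 1, 2))
# [1] 1 2 3

# Subject 1006, block 14 in the sample data: rows 1 and 2 are both zero.
select_rows(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 2))
# [1] 1 2

# Hypothetical group with only one 0/0 row: the helper returns 0.
select_rows(c(0, 1, 2), c(0, 1, 1))
# [1] 0
```

Because slice(0) selects no rows, a group with fewer than two 0/0 rows is dropped from the result entirely.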
If the data is large, you can also do this with data.table:
library(data.table)
setDT(data)[, .SD[select_rows(timeLeft, timeRight)], .(subject, block)]
Data
It is easier to help if you provide data in a reproducible format:
data <- structure(list(subject = c(1L, 1L, 1L, 1L, 1L, 1L, 1006L, 1006L,
1006L, 1006L, 1006L), block = c(13L, 13L, 13L, 13L, 13L, 13L,
14L, 14L, 14L, 14L, 14L), trial = c(0L, 1L, 3L, 4L, 6L, 7L, 0L,
1L, 3L, 4L, 6L), timeLeft = c(0L, 0L, 0L, 2L, 1L, 2L, 0L, 0L,
0L, 1L, 1L), timeRight = c(0L, 1L, 0L, 0L, 1L, 2L, 0L, 0L, 1L,
1L, 2L), stim1 = c(2L, 3L, 3L, 2L, 1L, 2L, 6L, 6L, 4L, 4L, 6L
), stim2 = c(1L, 2L, 1L, 3L, 3L, 1L, 5L, 4L, 5L, 5L, 4L), Chosen = c(2L,
2L, 1L, 3L, 1L, 1L, 5L, 6L, 4L, 4L, 6L)), class = "data.frame", row.names =
c("1", "2", "3", "4", "5", "6", "454", "455", "456", "457", "458"))
CodePudding user response:
If you want to keep all rows up to the last row where timeLeft and timeRight are both 0, you can try this.
Data
   subject block trial timeLeft timeRight stim1 stim2 Chosen
1        1    13     0        0         0     2     1      2
2        1    13     1        0         1     3     2      2
3        1    13     3        0         0     3     1      1
4        1    13     4        2         0     2     3      3
5        1    13     6        1         1     1     3      1
6        1    13     7        2         2     2     1      1
7     1006    14     0        0         1     6     5      5
8     1006    14     0        0         0     6     5      5
9     1006    14     1        0         0     6     4      6
10    1006    14     3        0         1     4     5      4
11    1006    14     4        1         1     4     5      4
12    1006    14     6        1         2     6     4      6
I added one more row for subject 1006 so that its first row is not 0, 0.
Code
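library(dplyr)
df %>%
  # group by block as well, as the question asks
  group_by(subject, block) %>%
  # key = position of the last row in the group where both timers are 0
  mutate(key = max(which(timeLeft == 0 & timeRight == 0))) %>%
  # key is constant within a group, so take its first value as the cutoff
  slice(1:first(key))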
  subject block trial timeLeft timeRight stim1 stim2 Chosen   key
    <int> <int> <int>    <int>     <int> <int> <int>  <int> <int>
1       1    13     0        0         0     2     1      2     3
2       1    13     1        0         1     3     2      2     3
3       1    13     3        0         0     3     1      1     3
4    1006    14     0        0         1     6     5      5     3
5    1006    14     0        0         0     6     5      5     3
6    1006    14     1        0         0     6     4      6     3
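One caveat: if a group contains no row where both columns are 0, which() returns an empty vector and max() of it gives -Inf with a warning, so the cutoff is no longer usable. A minimal base R sketch of the same idea that skips such groups could look like this. Here df is a cut-down stand-in for the data above with only the columns the logic needs, and is_zero, groups and keep_idx are names made up for illustration.

```r
# Cut-down stand-in for the 12-row data above (trial and stim columns omitted).
df <- data.frame(
  subject   = c(1, 1, 1, 1, 1, 1, 1006, 1006, 1006, 1006, 1006, 1006),
  block     = c(13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14),
  timeLeft  = c(0, 0, 0, 2, 1, 2, 0, 0, 0, 0, 1, 1),
  timeRight = c(0, 1, 0, 0, 1, 2, 1, 0, 0, 1, 1, 2)
)

# TRUE for rows where both timers are 0.
is_zero <- df$timeLeft == 0 & df$timeRight == 0

# Row numbers of df, split by subject/block combination.
groups <- split(seq_len(nrow(df)), list(df$subject, df$block), drop = TRUE)

# For each group keep the rows up to and including the last 0/0 row;
# a group without any 0/0 row contributes no rows instead of failing.
keep_idx <- unlist(lapply(groups, function(rows) {
  hits <- which(is_zero[rows])
  if (length(hits) == 0) return(integer(0))
  rows[seq_len(max(hits))]
}), use.names = FALSE)

# sort() restores the original row order.
result <- df[sort(keep_idx), ]
rownames(result)
# [1] "1" "2" "3" "7" "8" "9"
```

The kept rows are the same six rows as in the dplyr output above.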
CodePudding user response:
You can filter for only the rows that meet the condition and then group. Note that this keeps just the rows where both columns are 0, not the rows between them:
data %>%
  filter(timeLeft == 0 & timeRight == 0) %>%
  group_by(subject, block)
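For comparison, here is a small base R sketch of what a plain "both columns are 0" filter returns on the sample rows from the first answer. sample_rows and only_zero are names assumed for illustration; the values and row names are copied from the reproducible data.

```r
# timeLeft / timeRight and row names copied from the reproducible sample data.
sample_rows <- data.frame(
  timeLeft  = c(0L, 0L, 0L, 2L, 1L, 2L, 0L, 0L, 0L, 1L, 1L),
  timeRight = c(0L, 1L, 0L, 0L, 1L, 2L, 0L, 0L, 1L, 1L, 2L),
  row.names = c("1", "2", "3", "4", "5", "6", "454", "455", "456", "457", "458")
)

# Keep only rows where both timers are 0.
only_zero <- subset(sample_rows, timeLeft == 0 & timeRight == 0)
rownames(only_zero)
# [1] "1"   "3"   "454" "455"
```

Row 2 is part of the expected output in the question but is missing here, so a row-wise filter alone does not reproduce it.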