How to create a new variable based on condition from different dataframe in R

Time:03-26

I have 2 data frames from an experiment. The 1st df records a (roughly) continuous signal over 40 mins. It has 5 columns: columns 1:3 are binary and indicate whether a button was pushed, the 4th column is a binary flag for whether either of the buttons in columns 2 or 3 was pushed, and the 5th column is an approximate time in seconds. Example from the df below:

initiate left right l_or_r time
0 0 1 1 2.8225
0 0 1 1 2.82375
0 0 1 1 2.82500
0 0 1 1 2.82625
1 0 0 0 16.82000
1 0 0 0 16.82125

etc.

The 2nd data frame is session info where each row is a trial, usually 100-150 rows depending on the day. I have a column that marks trial start time and another column that marks trial end time in seconds. Example from df below (I omitted several irrelevant columns):

trial success t_start t_end
1 0 16.64709 35.49431
2 1 41.81843 57.74304
3 0 65.54510 71.16612
4 0 82.65743 87.30914

etc.

For the 1st data frame, I want to create a column that indicates whether or not the button was pushed within a trial. This is based on those start and end times in the 2nd df. I would like it to look something like this (iti = inter-trial, wt = within trial):

initiate left right l_or_r time trial
0 0 1 1 2.8225 iti
0 0 1 1 2.82375 iti
0 0 1 1 2.82500 iti
0 0 1 1 2.82625 iti
1 0 0 0 16.82000 wt
1 0 0 0 16.82125 wt

etc.

I had the idea to do something like this, but I don't have a grouping variable between the 2 data frames so it doesn't work:

df2 %>% 
  full_join(df1, by = "trial") %>% 
    mutate(in_iti = case_when(time < tstart & time > tend ~ "iti",
                              time > tstart & time < tend ~ "within_trial"))

Any ideas on how to label the rows in df1 based on the time condition from the df2?

Thank you!

CodePudding user response:

Maybe try the following with dplyr, if your data is relatively small. This assumes the data.frames are named df and df2. rowwise() processes one row of df at a time, and mutate with ifelse creates the new column by comparing that row's time against every t_start and t_end in the second data.frame.

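library(dplyr)

df %>%
  rowwise() %>%  # evaluate the condition one row (one time value) at a time
  # "wt" if the time falls strictly inside any trial window, otherwise "iti"
  mutate(trial = ifelse(any(time > df2$t_start & time < df2$t_end), "wt", "iti")) %>%
  ungroup()      # drop the rowwise grouping so later steps behave normally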

Output

  initiate  left right l_or_r  time trial
     <int> <int> <int>  <int> <dbl> <chr>
1        0     0     1      1  2.82 iti  
2        0     0     1      1  2.82 iti  
3        0     0     1      1  2.82 iti  
4        0     0     1      1  2.83 iti  
5        1     0     0      0 16.8  wt   
6        1     0     0      0 16.8  wt 
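
If the first data frame is large (a 40-minute signal sampled every ~1 ms is a lot of rows), the rowwise comparison can get slow. Below is a rough vectorized sketch in base R using findInterval. It assumes that df2 is sorted by t_start and that trials do not overlap; neither is stated in the question, so check that first. The sample data is rebuilt from the question, and the trial_id column is a hypothetical extra, not something the question asked for.

```r
# Sample data reconstructed from the question (column names as in the output above)
df <- data.frame(
  initiate = c(0, 0, 0, 0, 1, 1),
  left     = c(0, 0, 0, 0, 0, 0),
  right    = c(1, 1, 1, 1, 0, 0),
  l_or_r   = c(1, 1, 1, 1, 0, 0),
  time     = c(2.8225, 2.82375, 2.82500, 2.82625, 16.82000, 16.82125)
)
df2 <- data.frame(
  trial   = 1:4,
  success = c(0, 1, 0, 0),
  t_start = c(16.64709, 41.81843, 65.54510, 82.65743),
  t_end   = c(35.49431, 57.74304, 71.16612, 87.30914)
)

# ASSUMPTION: df2$t_start is sorted ascending and trials do not overlap.
# For each time, count the trial starts strictly before it (0 if none).
# left.open = TRUE keeps the comparison strict, like time > t_start above.
idx <- findInterval(df$time, df2$t_start, left.open = TRUE)

# End time of the most recently started trial; -Inf if no trial has started yet
candidate_end <- c(-Inf, df2$t_end)[idx + 1]

# A time is within a trial if it comes before that trial's end
inside <- df$time < candidate_end

df$trial <- ifelse(inside, "wt", "iti")

# Hypothetical extra column: which trial the row belongs to (NA between trials)
df$trial_id <- ifelse(inside, c(NA, df2$trial)[idx + 1], NA_integer_)

print(df)
```

On the six sample rows this gives the same iti/wt labels as the rowwise version. The difference is that every time is looked up against the sorted start times in one pass, instead of being compared with every trial for every row.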