Create new variable based on first row by group, based on condition and minimum timestamp in R

Time:11-19

I want to create a new variable that flags whether a row matches the pattern event = A and timestamp = minimum by group, i.e. whether it is the first event-A row for each participant id.

This is a sample dataset that I'm working with:

participant_id <- c("ps1", "ps1", "ps1", "ps1", "ps2", "ps2", "ps3", "ps3", "ps3", "ps3")
timestamp <- c(0.01, 0.02, 0.03, 0.04, 0.01, 0.02, 0.01, 0.02, 0.03, 0.04)
event <- c("A", "A", "A", "B", "B", "A", "A", "A", "B", "A")
data.frame(participant_id, timestamp, event)

Note. The data does not necessarily appear in ascending order.

And this is what I would like to end up with:

participant_id timestamp event first_A_row
ps1 0.01 A TRUE
ps1 0.02 A FALSE
ps1 0.03 A FALSE
ps1 0.04 B FALSE
ps2 0.01 B FALSE
ps2 0.02 A TRUE
ps3 0.01 A TRUE
ps3 0.02 A FALSE
ps3 0.03 B FALSE
ps3 0.04 A FALSE

CodePudding user response:

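After grouping by 'participant_id', we can subset 'timestamp' to the rows where 'event' is 'A', take the minimum, and create a logical column that is TRUE only for the 'A' row at that minimum. Checking 'event' as well prevents a 'B' row that happens to share the same timestamp from being flagged.

library(dplyr)
df1 %>% 
    group_by(participant_id) %>% 
    # TRUE only for the 'A' row holding the group's earliest 'A' timestamp
    mutate(first_A_row = event == 'A' & 
        timestamp == min(timestamp[event == 'A'])) %>%
    ungroup()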

-output

# A tibble: 10 × 4
   participant_id timestamp event first_A_row
   <chr>              <dbl> <chr> <lgl>      
 1 ps1                 0.01 A     TRUE       
 2 ps1                 0.02 A     FALSE      
 3 ps1                 0.03 A     FALSE      
 4 ps1                 0.04 B     FALSE      
 5 ps2                 0.01 B     FALSE      
 6 ps2                 0.02 A     TRUE       
 7 ps3                 0.01 A     TRUE       
 8 ps3                 0.02 A     FALSE      
 9 ps3                 0.03 B     FALSE      
10 ps3                 0.04 A     FALSE      
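
Two edge cases are worth checking with this kind of comparison against a group minimum: a participant with no 'A' rows at all (where `min()` on an empty vector returns `Inf` with a warning), and a non-'A' row that shares the earliest 'A' timestamp. The sketch below uses a small made-up data frame (`edge`, with hypothetical participants `ps4` and `ps5` that are not in the question's data) and passes `Inf` as an extra argument to `min()` so the empty case stays silent. It is one possible way to guard these cases, not the only one.

```r
library(dplyr)

# Hypothetical edge-case data (not from the question):
# ps4 never has event A; ps5 has a B row tied with its earliest A row,
# and its rows are deliberately not in ascending timestamp order.
edge <- data.frame(
  participant_id = c("ps4", "ps4", "ps5", "ps5", "ps5"),
  timestamp      = c(0.02, 0.01, 0.03, 0.01, 0.01),
  event          = c("B", "B", "A", "B", "A")
)

edge_flagged <- edge %>%
  group_by(participant_id) %>%
  # event == "A" keeps tied non-A rows FALSE;
  # the extra Inf avoids the empty-vector warning when a group has no A rows
  mutate(first_A_row = event == "A" &
           timestamp == min(timestamp[event == "A"], Inf)) %>%
  ungroup()

print(edge_flagged)
```

Only the last row (ps5, 0.01, A) should come out TRUE here; every ps4 row and the tied ps5 'B' row stay FALSE.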

data

df1 <- data.frame(participant_id, timestamp, event)
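
For comparison, the same flag can be sketched in base R without dplyr. This is an alternative under the assumption that the column names match the question's data; it restricts the data to the 'A' rows first and uses `ave()` to compute each participant's minimum timestamp among them, so row order does not matter and participants without any 'A' row simply stay FALSE.

```r
# Rebuild the question's sample data so the block is self-contained
participant_id <- c("ps1", "ps1", "ps1", "ps1", "ps2", "ps2", "ps3", "ps3", "ps3", "ps3")
timestamp <- c(0.01, 0.02, 0.03, 0.04, 0.01, 0.02, 0.01, 0.02, 0.03, 0.04)
event <- c("A", "A", "A", "B", "B", "A", "A", "A", "B", "A")
df1 <- data.frame(participant_id, timestamp, event)

# Start with FALSE everywhere, then fill in only the rows where event is A
is_A <- df1$event == "A"
df1$first_A_row <- FALSE

# Among the A rows, compare each timestamp with its participant's minimum
df1$first_A_row[is_A] <- with(
  df1[is_A, ],
  timestamp == ave(timestamp, participant_id, FUN = min)
)

print(df1)
```

If two 'A' rows for the same participant share the minimum timestamp, both would be flagged TRUE, which is also true of the dplyr version.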