Analysis of many attendance lists

Time:10-14

I have 8 attendance lists from 8 different conferences. I need to know which people attended at least 7 of the 8 conferences. I don't want to check name by name in each list, so I'm planning to do it in R, but I have no idea how. Any suggestions?

CodePudding user response:

There might be a simpler way (my R is getting a bit rusty), but this works:

library(dplyr)
unique_attendees <- c('a', 'b', 'c', 'd', 'e')

conf1_attendees <- c('a','b')
conf2_attendees <- c('a','b','c')
conf3_attendees <- c('a','b','c','e')
conf4_attendees <- c('b', 'e')
conf5_attendees <- c('a','d', 'e')
conf6_attendees <- c('a','d', 'e')
conf7_attendees <- c('a','b', 'e')
conf8_attendees <- c('a','b', 'c')

conferences <- list(conf1_attendees, conf2_attendees, conf3_attendees, conf4_attendees, conf5_attendees, conf6_attendees, conf7_attendees,conf8_attendees)

# For each person, count how many conference lists contain their exact name.
attendance_record <- dplyr::bind_rows(lapply(unique_attendees, function(x) {
  # %in% tests exact membership; grepl() would treat x as a regular
  # expression and also match substrings (e.g. 'Ann' inside 'Joanne').
  attended <- vapply(conferences, function(y) x %in% y, logical(1))
  data.frame(person = x, number_attended = sum(attended))
}))

# The comparison already returns TRUE/FALSE, so no ifelse() wrapper is needed.
result <- attendance_record %>%
  mutate(attended_at_least_7 = number_attended >= 7)

print(result)

Output:

  person number_attended attended_at_least_7
1      a               7                TRUE
2      b               6               FALSE
3      c               3               FALSE
4      d               2               FALSE
5      e               5               FALSE

Obviously you'll need to adapt it to your problem since we don't know how your records are stored.
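
For what it's worth, here is a shorter base R sketch of the same count, with no packages needed. It assumes each list is a character vector and that the same person is spelled identically in every list (if not, you would want to normalise first, e.g. with trimws() and tolower()). The names `counts` and `regulars` are just illustrative.

```r
# Same toy data as above, one character vector per conference
conferences <- list(
  c('a', 'b'),
  c('a', 'b', 'c'),
  c('a', 'b', 'c', 'e'),
  c('b', 'e'),
  c('a', 'd', 'e'),
  c('a', 'd', 'e'),
  c('a', 'b', 'e'),
  c('a', 'b', 'c')
)

# unique() within each list so a name entered twice at one conference
# is only counted once; table() then tallies appearances across lists
counts <- table(unlist(lapply(conferences, unique)))

# Keep only the people who appear in at least 7 lists
regulars <- names(counts[counts >= 7])
print(regulars)
# [1] "a"
```

This also builds the set of attendees from the lists themselves, so you don't have to maintain a separate vector of unique names.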
