I have 8 attendance lists from 8 different conferences. I need to know which people attended at least 7 of the 8 conferences. I don't want to check name by name in each list, so I'm planning to do it in R, but I have no idea how. Any suggestions?
CodePudding user response:
There might be a simpler way (my R is getting a bit rusty), but this works:
library(dplyr)

# Everyone who appears in at least one list
unique_attendees <- c('a', 'b', 'c', 'd', 'e')

# Example data: one character vector of names per conference
conf1_attendees <- c('a', 'b')
conf2_attendees <- c('a', 'b', 'c')
conf3_attendees <- c('a', 'b', 'c', 'e')
conf4_attendees <- c('b', 'e')
conf5_attendees <- c('a', 'd', 'e')
conf6_attendees <- c('a', 'd', 'e')
conf7_attendees <- c('a', 'b', 'e')
conf8_attendees <- c('a', 'b', 'c')
conferences <- list(conf1_attendees, conf2_attendees, conf3_attendees, conf4_attendees,
                    conf5_attendees, conf6_attendees, conf7_attendees, conf8_attendees)

# For each person, count how many conferences they appear in
attendance_record <- dplyr::bind_rows(lapply(unique_attendees, function(x) {
  # %in% tests exact membership, so 'ann' does not also count 'anna'
  # (grepl() matches substrings and would)
  attended <- sapply(conferences, function(y) x %in% y)
  data.frame(person = x, number_attended = sum(attended))
}))

# Flag the people who attended at least 7 of the 8 conferences
result <- attendance_record %>%
  mutate(attended_at_least_7 = number_attended >= 7)
print(result)
Output:
  person number_attended attended_at_least_7
1      a               7                TRUE
2      b               6               FALSE
3      c               3               FALSE
4      d               2               FALSE
5      e               5               FALSE
Obviously you'll need to adapt it to your problem since we don't know how your records are stored.
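If your lists can be loaded as plain character vectors, base R alone may be enough. The following is only a minimal sketch under that assumption: the sample data is made up to mirror the example above, and the clean-up step (trimming whitespace and lower-casing names) is my guess at what real attendance lists might need, not something taken from your data.

```r
# Hypothetical example data: one character vector of names per conference.
conferences <- list(
  c("a", "b"),
  c("a", "b", "c"),
  c("a", "b", "c", "e"),
  c("b", "e"),
  c("a", "d", "e"),
  c("a", "d", "e"),
  c("a", "b", "e"),
  c("a", "b", "c")
)

# Normalise names and drop duplicates within each list, so nobody is
# counted twice for the same conference (assumed clean-up step).
cleaned <- lapply(conferences, function(x) unique(tolower(trimws(x))))

# Pool all lists and count how many lists each name appears in.
counts <- table(unlist(cleaned))

# Keep the people who appear in at least 7 of the lists.
frequent <- names(counts)[as.vector(counts) >= 7]
print(frequent)
```

Since every name is kept at most once per list, the count from table() is the number of conferences that person attended. This also means you don't need to build unique_attendees by hand.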