Counting based on partial strings in R

Time:02-12

I'm working with survey data collected from a Google Form. One of the questions allowed respondents to provide multiple answers:

Data:

id_number <- c("101", "102", "103", "104", "105", "106")

why_join_program <- c("college assistance", 
                     "college assistance, fasfa support", 
                     "fasfa support, employment support", 
                     "college assistance, fasfa support, employment support", 
                     "college assistance, employment support",
                     "fasfa support")


df <- data.frame(id_number, why_join_program)

I have two questions:

  1. How can I count the responses that include a particular answer (e.g., how many respondents selected "college assistance")?

  2. What is a good way to organize this into a table? group_by() and summarize() create a table that counts every combination of responses. I want a table with one row per answer and the number of respondents who selected that answer. A respondent who selected multiple answers would be counted once under each of them.

Thank you.

CodePudding user response:

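Split each response on ", " with strsplit, flatten the resulting list with unlist, and tabulate the individual answers with table: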

strsplit(df$why_join_program, ', ') |> unlist() |> table()
# college assistance employment support      fasfa support 
#                  4                  3                  4 

Note: the native pipe |> requires R >= 4.1.
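To address the two questions separately, here is a minimal base R sketch that does not rely on the native pipe. It assumes every answer is separated by exactly ", " as in the sample data. The names n_college, answers, counts, and n_respondents are made up for illustration.

```r
# Same data as in the question
id_number <- c("101", "102", "103", "104", "105", "106")
why_join_program <- c("college assistance",
                      "college assistance, fasfa support",
                      "fasfa support, employment support",
                      "college assistance, fasfa support, employment support",
                      "college assistance, employment support",
                      "fasfa support")
df <- data.frame(id_number, why_join_program, stringsAsFactors = FALSE)

# Question 1: count respondents whose response contains one specific answer.
# fixed = TRUE matches the literal text instead of treating it as a regex.
n_college <- sum(grepl("college assistance", df$why_join_program, fixed = TRUE))

# Question 2: split every response into individual answers, then tabulate.
# A respondent with several answers contributes one count to each of them.
answers <- unlist(strsplit(df$why_join_program, ", ", fixed = TRUE))
counts <- as.data.frame(table(answer = answers),
                        responseName = "n_respondents",
                        stringsAsFactors = FALSE)
counts
```

With the sample data, n_college is 4 and counts has one row per answer. Note that grepl matches substrings, so if one answer option were contained in another (say, "support" and "fasfa support"), the split-and-tabulate approach would be the safer of the two.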
