R Frequency table with multiple choice question that allows more than one response

Time:09-20

I am analyzing survey data in which respondents could select more than one county when asked where their organization is located. I am trying to create a frequency table that counts every time a county is chosen, regardless of whether a respondent chose one county or several.

Example of data:

df <- data.frame(org = c("org_1", "org_2", "org 3", "org 4"),
             county = c("A, B", "A, D", "B, C", "B"))

Here is the output I would like:

output <- data.frame(county = c("A", "B", "C", "D"),
                 frequency = c(2, 3, 1, 1))

I've tried to use some of the standard frequency table options, such as table(df$county), but this counts "A, B", "A, D", and "B, C" each as unique values, rather than seeing "A", "B", "C", and "D" as individual values.

CodePudding user response:

Use separate_rows() from tidyr to split the column into one row per selected county, then get the frequencies with count() from dplyr.

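library(tidyr)
library(dplyr)

df %>%
  # The default sep splits on runs of non-alphanumeric characters,
  # so "A, B" becomes two rows: "A" and "B"
  separate_rows(county) %>%
  # Tally the rows per county and name the count column 'frequency'
  count(county, name = 'frequency')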

Output:

# A tibble: 4 × 2
  county frequency
  <chr>      <int>
1 A              2
2 B              3
3 C              1
4 D              1
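
If you would rather not load tidyr and dplyr, the same split-then-count idea can be sketched in base R. This is not part of the answer above, just a minimal alternative; it assumes the choices are always separated by a comma with optional whitespace after it, and the names choices and counts are made up for illustration.

```r
df <- data.frame(org = c("org_1", "org_2", "org 3", "org 4"),
                 county = c("A, B", "A, D", "B, C", "B"))

# Split every response on a comma plus optional whitespace and flatten
# the pieces into one character vector of individual counties.
# as.character() guards against county being a factor (R < 4.0 default).
choices <- unlist(strsplit(as.character(df$county), ",\\s*"))

# Tabulate the counties and convert the table to a two-column data frame
counts <- as.data.frame(table(county = choices),
                        responseName = "frequency",
                        stringsAsFactors = FALSE)
counts
```

Splitting on the comma explicitly also keeps multi-word county names (for example "New York") in one piece, whereas the default sep of separate_rows() would break them at the space; with tidyr you would pass a sep argument to get the same behaviour.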